Qxf2 is exploring commonly used Rust crates and sharing what we learn. We are focusing on crates that testers will end up using quite often, and we have tried to include illustrative examples to help testers understand how to use them. This post continues our previous post on Regex crate modules. In this post we cover the following methods with examples:
- Regex::is_match
- Regex::captures
- Regex::captures_iter
- Regex::captures_len
Regex::is_match:
The is_match() method in Rust allows you to check whether a given regular expression pattern matches any part of a text string. The method returns a boolean value, true if there is at least one match, and false otherwise.
For example, if the pattern is r"(\d+)" and the text is "The answer is 42", then is_match() returns true, because the text contains digits.
The following example uses is_match() to validate a mobile number entered by the user.
/*
is_match: checks whether the regex pattern matches anywhere in the text.
The pattern below validates a mobile number. It is anchored with ^ and $
so that the entire input has to match.
*/
use regex::Regex;
use std::io;

fn prompt_user_input(prompt: &str) -> String {
    println!("{}", prompt);
    let mut input = String::new();
    match io::stdin().read_line(&mut input) {
        Ok(_) => input.trim().to_string(),
        Err(error) => {
            eprintln!("Failed to read line: {}", error);
            std::process::exit(1);
        }
    }
}

fn mobile_number_validation(s: &str) -> Result<String, regex::Error> {
    // +91, an optional hyphen or whitespace character, then exactly 10 digits
    let pattern = r"^\+91[-\s]?\d{10}$";
    let re = Regex::new(pattern)?;
    if re.is_match(s) {
        Ok("Valid Mobile Number".to_string())
    } else {
        Ok("Invalid Mobile Number".to_string())
    }
}

fn main() {
    let mobile_number = prompt_user_input("Please Input Mobile Number (like +91-XXXXXXXXXX):");
    println!("Mobile Number : {}", mobile_number);
    match mobile_number_validation(&mobile_number) {
        Ok(result) => println!("{}", result),
        // The only possible error here is a pattern that fails to compile
        Err(error) => eprintln!("Invalid regex pattern: {}", error),
    }
}
Regex::captures:
captures() is a method of the Regex type that returns an Option<Captures>: Some(Captures) for the first match in the string, or None if the pattern does not match. A Captures object contains information about the match, such as the start and end positions of the whole match and of each capture group. You can use captures() to extract the parts of a string that match a regex pattern.
For example, if you have a regex like r"(\d+)-(\w+)" and a string like "123-abc", you can use captures() to get the numbers and the letters separately. The Captures object will have three elements: the whole match "123-abc", the first capture group "123", and the second capture group "abc".
The following example uses captures() to extract the day, month, and year from the given date string and checks that they form a valid date.
/*
captures: returns an Option containing the capture groups if the pattern matches the date.
The regex pattern below captures a date in dd/mm/yyyy format.
*/
use regex::Regex;
use std::error::Error;

// Leap year validation function.
fn is_leap_year(year: u32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0)
}

fn validate_date(date: &str) -> Result<String, Box<dyn Error>> {
    // Declaring the pattern and validating the pattern
    let pattern = r"(\d{1,2})/(\d{1,2})/(\d{4})"; // Pattern to match dates
    let re = match Regex::new(pattern) {
        Ok(re) => re,
        Err(_) => return Err("Invalid regex pattern".into()),
    };
    if let Some(captures) = re.captures(date) {
        // Groups 1, 2 and 3 hold the day, month and year
        if let (Some(day), Some(month), Some(year)) = (
            captures.get(1).and_then(|m| m.as_str().parse::<u32>().ok()),
            captures.get(2).and_then(|m| m.as_str().parse::<u32>().ok()),
            captures.get(3).and_then(|m| m.as_str().parse::<u32>().ok()),
        ) {
            let days_in_month = match month {
                1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
                4 | 6 | 9 | 11 => 30,
                2 if is_leap_year(year) => 29,
                2 => 28,
                _ => {
                    return Err("Invalid month".into());
                }
            };
            if day > 0 && day <= days_in_month {
                return Ok(format!("Day: {}, Month: {}, Year: {}", day, month, year));
            } else {
                return Err("Invalid day for the given month and year".into());
            }
        }
    }
    Err("Invalid date format, provide correct input".into())
}

fn main() {
    // Change the date to check different date validations.
    // Valid pattern dd/mm/yyyy or d/m/yyyy
    let date = "28/15/2025";
    match validate_date(date) {
        Ok(validated_date) => println!("{}", validated_date),
        Err(err) => println!("Error: {} \nDate: {}", err, date),
    }
}
Regex::captures_iter:
captures_iter() is a method of the Regex type that returns an iterator over the Captures objects for all non-overlapping matches in a string. An iterator is a way of looping over a collection of items one by one. You can use captures_iter() to extract parts of a string when the regex pattern matches multiple times.
For example, with the regex r"(\d+)-(\w+)" and the string "123-abc 456-def 789-ghi", captures_iter() yields three Captures objects, one per match. Each has three elements: the whole match, the first capture group, and the second capture group. The first holds "123-abc", "123", and "abc"; the second holds "456-def", "456", and "def"; the third holds "789-ghi", "789", and "ghi".
The following example uses captures_iter() to extract IP addresses from each string in the provided vector and validates them.
/*
captures_iter: lets you iterate over every match of a regular expression pattern in a text.
The example below captures the valid IP addresses from the provided texts and prints them.
*/
use regex::{Error, Regex};

fn validate_ips(texts: Vec<&str>) -> Result<Vec<String>, Error> {
    let mut valid_ips = Vec::new();
    // Matches 1 to 10 dot-separated groups of digits; is_valid_ip() rejects wrong counts
    let pattern = Regex::new(r"\b(?:\d{1,3}\.){0,9}\d{1,3}\b")?;
    let mut ip_found = false; // Flag to check if any IP is found in the texts
    for text in texts {
        let text_has_valid_ip = process_text(&mut valid_ips, &pattern, text);
        if !text_has_valid_ip {
            println!("No valid IP address found in the text: '{}'\n", text);
        } else {
            ip_found = true;
        }
    }
    if !ip_found {
        println!("No valid IP addresses found in the texts vector.");
    }
    Ok(valid_ips)
}

fn process_text(valid_ips: &mut Vec<String>, pattern: &Regex, text: &str) -> bool {
    let mut text_has_valid_ip = false; // Flag to check if any valid IP is found in the current text
    for captures in pattern.captures_iter(text) {
        // Group 0 is the whole match, i.e. the candidate IP address
        if let Some(ip) = captures.get(0).map(|m| m.as_str()) {
            if is_valid_ip(ip) {
                if !valid_ips.contains(&ip.to_string()) {
                    valid_ips.push(ip.to_string());
                }
                text_has_valid_ip = true;
            }
        }
    }
    text_has_valid_ip
}

fn is_valid_ip(ip: &str) -> bool {
    let octets: Vec<&str> = ip.split('.').collect();
    if octets.len() != 4 {
        println!("Invalid IP: {} (Wrong number of octets)", ip);
        return false;
    }
    for octet in &octets {
        // Each octet must fit in a u8 (0 to 255)
        if octet.parse::<u8>().is_err() {
            println!("Invalid IP: {} (Octet value out of range: {})", ip, octet);
            return false;
        }
    }
    true
}

fn main() {
    let texts = vec![
        "192.168.0.1 is the router's IP address.",
        "The server's IP is 10.0.0.11.12", // No valid IP in this text
        "300.200.100.1",
        "Another valid IP is 172.16.254.1",
        "10.11",
        "11",
        "ab.bc.da.xy",
        "190.350.10.11",
        "The IP: 10.15.20.21",
    ];
    match validate_ips(texts) {
        Ok(valid_ips) => {
            if valid_ips.is_empty() {
                println!("No valid IPs found");
            } else {
                println!("\nValid IPs:");
                for ip in valid_ips {
                    println!("{}", ip);
                }
            }
        }
        Err(err) => {
            println!("Regex error: {}", err);
        }
    }
}
Regex::captures_len:
captures_len() is a method of the Regex type that returns the number of capture groups in a regex. A capture group is a part of a regex pattern that can be extracted from a match. Capture groups are usually marked by parentheses in the regex syntax.
For example, the regex r"(\d+)-(\w+)" has two explicit capture groups, one for the digits and one for the letters, so captures_len() returns 3 once the implicit group for the whole match is counted.
captures_len() can be useful to check how many capture groups a regex has before using other methods like captures() or captures_iter(). It can also be used to iterate over all the capture groups in a match. It always returns at least 1, because the whole match is counted as an implicit capture group at index 0.
The following example uses captures_len() along with captures() to extract the timestamp, log level, and message from each log entry.
/*
captures_len: returns the number of capture groups in a regular expression pattern,
including the implicit group 0 for the whole match.
The program below uses captures_len to verify that the pattern has the expected
structure before extracting the captured data.
*/
use regex::Regex;

fn capture_info_from_log_entry(log_entries: Vec<&str>) -> Vec<Result<(String, String, String), String>> {
    // Define a regex pattern to capture timestamp, log level, and message
    let pattern = r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] \[([A-Za-z]+)\] (.+)";
    let re = match Regex::new(pattern) {
        Ok(re) => re,
        Err(_) => return vec![Err("Invalid regex pattern".to_string()); log_entries.len()],
    };
    log_entries
        .iter()
        .map(|log_entry| {
            // Count the capture groups: the whole match plus the 3 explicit groups
            let captures_len = re.captures_len();
            match captures_len {
                4 => {
                    if let Some(captures) = re.captures(log_entry) {
                        let timestamp = captures.get(1).map(|m| m.as_str().to_string());
                        let log_level = captures.get(2).map(|m| m.as_str().to_string());
                        let message = captures.get(3).map(|m| m.as_str().to_string());
                        match (timestamp, log_level, message) {
                            (Some(timestamp), Some(log_level), Some(message)) => {
                                Ok((timestamp, log_level, message))
                            }
                            _ => Err("A capture group is missing from the match".to_string()),
                        }
                    } else {
                        // The log entry does not have the expected format
                        Err("No match found".to_string())
                    }
                }
                _ => Err("Invalid regex pattern. Expected 3 capture groups.".to_string()),
            }
        })
        .collect()
}

fn main() {
    // Sample log entries
    let log_entries = vec![
        "[2023-10-31 15:23:45] [ERROR] *370 connect() failed (111: Unknown error) while connecting to upstream, client: 135.125.246.189, server: _, request: \"GET /.env HTTP/1.1\", upstream: \"http://127.0.0.1:5000/.env\", host: \"18.118.196.200\"",
        "[2023-10-31 15:30:00] [INFO] Application started successfully",
        "[2023-10-31 15:40:22] [WARNING] Unrecognized log entry format",
        "[2023-01-01 00:00:00] [info] ",
        " [error] some text",
        " hi", // Add more log entries for testing
    ];
    let results = capture_info_from_log_entry(log_entries);
    for (index, result) in results.iter().enumerate() {
        match result {
            Ok((timestamp, log_level, message)) => {
                println!("Log Entry {}: ", index + 1);
                println!("Timestamp: {}", timestamp);
                println!("Log Level: {}", log_level);
                println!("Message: {}", message);
                println!();
            }
            Err(err) => {
                println!("Error in Log Entry {}: {}", index + 1, err);
                println!();
            }
        }
    }
}
In this post we covered a few more methods of the regex crate, namely is_match(), captures(), captures_iter(), and captures_len(), for pattern matching and extraction, with examples. These methods enable efficient text processing, allowing validation, extraction, and manipulation of data from strings.
Hire technical testers from Qxf2
Qxf2 is the home for technical testers. We employ experienced testers with a technical bent of mind. Our testers are naturally inclined towards learning new things and take the time to share their learnings on this blog. Additionally, as a company we invest in the practical development of all our employees. For example, since 2023, we have tried to get everyone to learn and use Rust. This post is an outcome of a couple of our testers having used Rust’s regex crate in their daily activities. If you want to work with technical test engineers in your project, please get in touch with us.
I love technology and learning new things. I explore both hardware and software. I am passionate about robotics and embedded systems, which motivates me to develop my software and hardware skills. I have good knowledge of Python, Selenium, Arduino, C and hardware design. I have developed several robots and participated in robotics competitions. I am constantly exploring new test ideas and test tools for software and hardware. At Qxf2, I am working on developing hardware tools for automated tests à la Tapster. Incidentally, I created Qxf2's first robot. Besides testing, I like playing cricket and badminton, and developing embedded gadgets for fun.