A guide to Regex Crate

Qxf2 is exploring commonly used Rust crates. We are writing a series of posts on useful crates to help you understand how to use them, with examples. This post is about the Rust regex crate.

Disclaimer: We are not developers. But we make it a point to share our learning. This post was worked on in late 2023. Depending on when you are reading this, standards for writing good Rust programs might have evolved.


Introduction to regex crate:

The Rust regex crate is a powerful tool for working with regular expressions. It helps you search, match, and manipulate text patterns, and it is commonly used for validating user input or extracting data from strings.
In this post we cover the following regex crate methods, each with an example that you can refer to in order to understand the crate better.

  • Regex::find
  • Regex::find_iter
  • Regex::replace_all
  • Regex::splitn

Prerequisite:

To add this crate to your project, add “regex” to the [dependencies] section of your “Cargo.toml” file or run the command:

cargo add regex

Refer to the Cargo.toml file:

[package]
name = "rust-motw-regex"
version = "0.1.0"
edition = "2021"
 
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
regex = "1.10.2"
ansi_term = "0.12.1"

Note: The ansi_term crate is a library for colours and formatting. The examples use it to colour their terminal output.


Since our regex project is going to have multiple binaries, we need to create a src/bin directory, where we will place the source file for each executable. To run an individual example, use the following command from the project directory:

cargo run --bin <example filename>

Note: Replace <example filename> with the actual file name, without the .rs extension.


Regex::find:

The find() method in the regex crate searches for the first occurrence of a pattern in a text string. It returns an Option: Some(Match) when a match is found, from which you can read the matched text and its start and end byte offsets, or None when nothing matches. This method is handy for pinpointing specific content in your text.

Refer to the example below:
/* find(): searches for the first occurrence of a regex pattern in a string.
The program below takes an input sentence and a search word from the user and
prints the position of the search word.
RegexBuilder is used to make the search case-insensitive.
*/
use regex::RegexBuilder;
use std::io;
use ansi_term::Colour::{Red, Green};
 
fn find_word_in_sentence(sentence: &str, search_word: &str) -> Result<String, regex::Error> {
    // Build a case-insensitive Regex object for the pattern.
    // `?` returns the compilation error to the caller if the pattern is invalid.
    let re = RegexBuilder::new(search_word)
        .case_insensitive(true)
        .build()?;
 
    // Search for the pattern in the sentence
    if let Some(mat) = re.find(sentence) {
        let matched_text = mat.as_str();
        let start = mat.start();
        let end = mat.end();
 
        let word_position = format!("Found '{}' at positions {}-{}", matched_text, start, end);
        Ok(word_position)
    } else {
        let word_position = "Word not found in the provided sentence. ".to_string();
        Ok(word_position)
    }
}
 
fn prompt_user_input(prompt: &str) -> String {
    println!("{}", prompt);
    let mut input = String::new();
    match io::stdin().read_line(&mut input) {
        Ok(_) => input.trim().to_string(),
        Err(error) => {
            eprintln!("Failed to read line: {}", error);
            std::process::exit(1);
        }
    }
}
// Program starting point
fn main() {
    let sentence = prompt_user_input("Input Sentence:");
    println!("Sentence: {}", sentence);
 
    let search_word = prompt_user_input("Enter Search word from the input sentence:");
    println!("Search Word: {}", search_word);
 
    // Validate input
    if sentence.is_empty() || search_word.is_empty() {
        println!("{}", Red.paint("Error: Input cannot be empty."));
        return;
    }
    match find_word_in_sentence(&sentence, &search_word) {
        Ok(word_position) => {
            println!("{}", Green.paint(word_position));
        }
        Err(err) => println!("Error: {}", err),
    }
}
Check the output of the above code:

[Image: Regex_find]


Regex::find_iter:

The find_iter() method in the regex crate searches for all non-overlapping occurrences of a pattern in a text string. It returns an iterator that yields each match in order, so you can loop over the matches and retrieve their text and positions. This method is useful when your text may contain several matches.

Refer to the example below:
/*
The code below uses the find_iter() method.
It searches the provided text for anything that matches an email address pattern.
*/
 
use regex::Regex;
use std::error::Error;
use ansi_term::Colour::Green;
 
fn search_email_pattern_from_sentence(text: &str) -> Result<Vec<String>, Box<dyn Error>> {
    let mut emails = Vec::new(); // Store the found email addresses
 
    // Define a valid regex pattern for email address
    let pattern = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b";
 
    // Create a Regex object; `?` converts a regex::Error into Box<dyn Error>
    let re = Regex::new(pattern)?;
 
    // Find all matches in the text
    for mat in re.find_iter(text) {
        emails.push(mat.as_str().to_string()); // Store the found email addresses
    }
 
    if emails.is_empty() {
        return Err("No email addresses found in the text.".into()); // Return an error if no emails are found
    }
 
    Ok(emails) // Return the found email addresses
}
 
fn main() {
    // Placeholder addresses on the reserved example.com domain
    let text = "Contact us at support@example.com or sales@example.com or invalid.com for assistance.";
    println!("The text: {} ", text);
 
    match search_email_pattern_from_sentence(text) {
        Ok(emails) => {
            // Print the matches
            println!("====================Email from text=============================");
            for email in emails {
                println!("{}", Green.paint(email));
            }
        }
        Err(err) => {
            println!("{}", err);
        }
    }
}
Check the output of the above code:

[Image: Regex_find_iter]


Regex::replace_all:

The replace_all() method in the regex crate replaces all occurrences of a pattern in a text string with a specified replacement. It provides a convenient way to modify your text globally, ensuring that every match of the pattern is replaced throughout the entire string.

Refer to the example below:
/*
The code below uses the Regex::replace_all() method.
The program prompts the user to enter a sentence, the word to replace, and the replacement word.
 */
use regex::Regex;
use std::io;
use ansi_term::Colour::{Red, Blue, Green, Yellow};
 
fn prompt_user_input(prompt: &str) -> String {
    println!("{}", prompt);
    let mut input = String::new();
    match io::stdin().read_line(&mut input) {
        Ok(_) => input.trim().to_string(),
        Err(error) => {
            eprintln!("Failed to read line: {}", error);
            std::process::exit(1);
        }
    }
}
 
fn replace_text(sentence: &str, word_to_replace: &str, replacement_word: &str) -> Result<String, regex::Error> {
    // Create a Regex object that matches the word to be replaced as a whole word.
    // regex::escape() makes any regex metacharacters in the user's input literal.
    let re = Regex::new(&format!(r"\b{}\b", regex::escape(word_to_replace)))?;
 
    // Replace all occurrences of the word. regex::NoExpand inserts the replacement
    // verbatim, so a `$` typed by the user is not treated as a capture group reference.
    let replaced_sentence = re.replace_all(sentence, regex::NoExpand(replacement_word)).to_string();
    Ok(replaced_sentence)
}
 
// Program starting point
fn main() {
    let sentence = prompt_user_input("Enter a Sentence:");
    println!("Sentence: {}", Blue.bold().paint(&sentence));
 
    let word_to_replace = prompt_user_input("Enter the word to replace:");
    println!("Word to replace: {}", Red.bold().paint(&word_to_replace));
 
    let replacement_word = prompt_user_input("Enter the replacement word:");
    println!("Replacement word: {}", Green.bold().paint(&replacement_word));
 
    match replace_text(&sentence, &word_to_replace, &replacement_word) {
        Ok(replaced_sentence) => {
            println!("Modified sentence: {}", Yellow.bold().paint(&replaced_sentence));
        }
        Err(err) => eprintln!("Error replacing text: {}", err),
    }
}
Check the output of the above code:

[Image: Regex_replace_all]


Regex::splitn:

The splitn() method in the regex crate divides a text string into at most a given number of parts, using matches of a pattern as the delimiters. It returns an iterator that allows you to loop through the separated segments. Once the limit is reached, the last segment contains the rest of the string, including any remaining delimiters. This method is useful for breaking a string into a fixed number of meaningful components.

Refer to the example below:
/*
The example below uses the splitn() method to extract key details such as
product name, price, and description.
*/
use regex::Regex;
use std::error::Error; 
use ansi_term::Colour::{Red, Blue};
 
fn split_text(product_description: &str) -> Result<Vec<&str>, Box<dyn Error>> {
    let pattern = r"\|"; // "|" is the delimiter; it is escaped because it is a regex metacharacter
    let re = Regex::new(pattern)?;
 
    // Split into at most 3 parts; the third part keeps the rest of the text
    let parts: Vec<&str> = re.splitn(product_description, 3).collect();
    Ok(parts)
}
 
fn main() {
    let product_description = "Widget | $29.99 | A high-quality widget for your needs.";
    println!("The Product Description: {}", Red.paint(product_description));
    println!("{}", Blue.paint("================The Split:Product/Price/Desc==================================="));
    match split_text(product_description) {
        Ok(parts) => {
            if parts.len() >= 3 {
                let product_name = parts[0].trim();
                let price = parts[1].trim();
                let description = parts[2].trim();
 
                println!("Product Name: {}", product_name);
                println!("Price: {}", price);
                println!("Description: {}", description);
            } else {
                println!("Invalid product description format");
            }
        }
        Err(err) => {
            println!("Failed to split product description: {}", err);
            // You can take further actions here for error handling if needed.
        }
    }
}
Check the output of the above code:

[Image: Regex_splitn]


We hope the methods and examples above help you understand the regex crate. We explored essential methods like find(), find_iter(), replace_all(), and splitn(), each offering unique capabilities. Whether you’re locating patterns, iterating through matches, globally replacing text, or segmenting strings, these methods simplify complex tasks.
In our next post, we will cover a few more regex crate methods with examples.


Hire technical testers from Qxf2

Qxf2 hires technical testers who are ahead of the curve. We go out of our way to stay in touch with early market trends. As you can see from this post, we have transitioned to learning, using and sharing Rust code well before Rust has become mainstream. If you are looking for highly technical engineers who excel at testing, please get in touch with us.

