Read and Write csv data using Rust

This post will discuss how to read and write csv data using Rust and the csv crate. We are aiming this at testers who are trying to do useful things with Rust just like us at Qxf2. The two most common operations a tester working with csv would like to do are: reading csv data and dynamically creating csv data. This maps to reading and writing csv files.

Note: As testers, we are showing you how to work with the most primitive data structure (Vec<Vec<String>>). If you figure out how to work with this data structure, the other (more real) data structures you come across will be much easier to deal with.
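To make the Vec<Vec<String>> shape concrete, here is a minimal sketch of looking up a single cell and pulling out one column. The helper names cell and column are ours, made up for illustration; they are not part of the csv crate or of the gist.

```rust
/// Returns the cell at (row, col), or None if either index is out of range.
/// `cell` is a hypothetical helper name used only for this sketch.
fn cell(csv_data: &[Vec<String>], row: usize, col: usize) -> Option<&str> {
    csv_data.get(row)?.get(col).map(|field| field.as_str())
}

/// Collects one column across all rows, skipping rows that are too short.
/// `column` is a hypothetical helper name used only for this sketch.
fn column(csv_data: &[Vec<String>], col: usize) -> Vec<&str> {
    csv_data
        .iter()
        .filter_map(|row| row.get(col))
        .map(|field| field.as_str())
        .collect()
}
```

Using get instead of indexing with [] means a missing row or column comes back as None rather than causing a panic, which is usually what a test wants when the csv file might be malformed.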

Complete snippet of the read and write function

For folks who would rather look at code than read a post, you can find the complete code here: gist.

1. Read Sample Code

In this example we are going to use the ReaderBuilder struct in the csv crate to parse the csv data.
We have a csv file in our current directory, and we pass its path to the reader. ReaderBuilder also lets us say whether the first row of the csv should be treated as headers, through has_headers, which takes a bool. When it is true, the first row is read as the header row and is not returned as a record; when it is false, the first row is returned like any other record.

1.1 Creating a csv handler

We first make a ReaderBuilder, set the header flag, and pass the file path.

let reader_result = ReaderBuilder::new().has_headers(is_header_present).from_path(csv_path);
let reader = match reader_result {
    Ok(reader) => reader,
    Err(err) => return Err(Box::new(err)),
};

1.2 Iterate through the reader

We are using Vec<Vec<String>> to store the csv data. Once the reader is created, we iterate over it to read the records.

for record in reader.into_records() {
    let record = match record {
        Ok(record) => record,
        Err(err) => {
            if flag_ignore_error {
                continue;
            } else {
                return Err(Box::new(err));
            }
        }
    };
}

You may have noticed that we used flag_ignore_error. How to handle a bad record is the tester's decision: when an error is encountered, should the function return it straight away, or should the caller be able to ignore it and carry on reading the records that follow? We wanted the caller to have that choice, so flag_ignore_error is part of the function signature. If the flag is set to true, the record that produced the error is skipped and reading continues with the next record. If the flag is set to false, the error is returned and reading stops.
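The skip-or-stop decision does not depend on the csv crate, so it can be tried in isolation. The sketch below uses only the standard library. The names parse_line and collect_numbers are hypothetical, and parsing a number stands in for reading a record that may be malformed.

```rust
use std::error::Error;
use std::num::ParseIntError;

/// Hypothetical stand-in for reading one record: parsing can fail,
/// much like reading a malformed csv record can.
fn parse_line(line: &str) -> Result<i32, ParseIntError> {
    line.trim().parse::<i32>()
}

/// Keeps every line that parses. A bad line is either skipped or returned
/// as an error, depending on `flag_ignore_error`.
fn collect_numbers(
    lines: &[&str],
    flag_ignore_error: bool,
) -> Result<Vec<i32>, Box<dyn Error>> {
    let mut numbers = Vec::new();
    for line in lines {
        let number = match parse_line(line) {
            Ok(number) => number,
            Err(err) => {
                if flag_ignore_error {
                    continue; // skip the bad line and keep going
                } else {
                    return Err(Box::new(err)); // stop at the first bad line
                }
            }
        };
        numbers.push(number);
    }
    Ok(numbers)
}
```

With the input ["1", "oops", "3"], passing true gives back the two good values, while passing false returns the parse error for "oops".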

1.3 Breaking records into columns

Once a record is read, we break it into columns and store them as a Vec<String>.

let row: Vec<String> = record
    .iter()
    .map(|field| field.trim().to_string())
    .collect();

We iterate over the record; field is a reference to each element. For each field we trim any leading or trailing whitespace, convert it to a String, and collect the results into a new vector, row.
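The same trim-and-collect chain can be tried on plain string slices without a csv file. This is a small sketch under that assumption; clean_row is a hypothetical name and the input slice stands in for a csv record.

```rust
/// Trims each field and returns owned Strings, mirroring the closure above.
/// `clean_row` is a hypothetical helper name used only for this sketch.
fn clean_row(fields: &[&str]) -> Vec<String> {
    fields
        .iter()
        .map(|field| field.trim().to_string())
        .collect()
}
```

Note that trim only removes whitespace at the two ends of a field. Whitespace inside a field, such as the space in "Cyberdyne Systems", is left alone.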

1.4 Storing each row

Finally, we store each row in the csv data.

csv_data.push(row);

1.5 Putting it all together

The complete snippet of the read function looks like this.

use csv::ReaderBuilder;

fn read_csv_details(
    csv_path: &str,
    flag_ignore_error: bool,
    is_header_present: bool,
) -> Result<Vec<Vec<String>>, Box<dyn std::error::Error>> {
 
    let reader_result = ReaderBuilder::new().has_headers(is_header_present).from_path(csv_path);
    let reader = match reader_result {
        Ok(reader) => reader,
        Err(err) => return Err(Box::new(err)),
    };    
 
    let mut csv_data: Vec<Vec<String>> = Vec::new();
 
    for record in reader.into_records() {
        let record = match record {
            Ok(record) => record,
            Err(err) => {
                if flag_ignore_error {
                    continue;
                } else {
                    return Err(Box::new(err));
                }
            }
        };
 
        let row: Vec<String> = record
            .iter()
            .map(|field| field.trim().to_string())
            .collect();
 
        csv_data.push(row);        
    }  
 
    Ok(csv_data)
}
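A side note for readers who come across the ? operator in other examples: a match that returns Err(Box::new(err)) can usually be shortened with ?, which converts the error into Box<dyn std::error::Error> for you. We kept the explicit match in this post because it shows what is happening. The sketch below puts the two forms side by side, using std::fs::File as a stand-in for the csv reader; both function names are hypothetical.

```rust
use std::error::Error;
use std::fs::File;

/// Explicit form, in the style used in this post.
fn open_explicit(path: &str) -> Result<File, Box<dyn Error>> {
    let file = match File::open(path) {
        Ok(file) => file,
        Err(err) => return Err(Box::new(err)),
    };
    Ok(file)
}

/// Shorter form: `?` returns early on Err and boxes the error.
fn open_with_question_mark(path: &str) -> Result<File, Box<dyn Error>> {
    let file = File::open(path)?;
    Ok(file)
}
```

Both functions behave the same way for a missing file: the caller gets an Err and nothing panics.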

2. Write Sample Code

Testers also sometimes find it useful to create a csv file to store data produced by tests.
Here, we will see one example of creating a csv file with a custom type.

2.1 The custom type

struct Cyborg {
    name: String,
    model: String,
    organization: String,
    abilities: String,
    creation_date: Option<NaiveDate>,
}

In this example, we are going to use a custom type named “Cyborg” and write the attributes to a csv file. We are going to use a Writer to take the inputs and write those values in a valid csv format as output. We are specifying the csv_path in the function signature.

2.2 Creating a csv handler

let writer_result = Writer::from_path(csv_path);
let mut writer = match writer_result {
    Ok(writer) => writer,
    Err(err) => return Err(Box::new(err)),
};

2.3 Including the header in the csv file

We use write_record to add the headers to the csv file.

writer.write_record(&["name", "model", "organization", "abilities", "creation_date"]);

2.4 Iterate over the records

We iterate over the records and write the records in the csv file.

for cyborg in cyborgs {
    // write_record returns Ok(()) on success, so there is no value to keep
    match writer.write_record(&[
        cyborg.name,
        cyborg.model,
        cyborg.organization,
        cyborg.abilities,
        cyborg.creation_date.map(|date| date.to_string()).unwrap_or_else(|| "1796-04-02".to_string()),
    ]) {
        Ok(()) => {}
        Err(err) => return Err(Box::new(err)),
    };
}

The creation_date field above has the type Option<NaiveDate>. We purposely kept it as Option<NaiveDate> to show testers how to convert between types. The easy way would have been to use String, but type conversion is one of the things testers will come across.

We first converted it to Option<String> using map, and then to String using unwrap_or_else. While converting to a String we could have used unwrap, but unwrap panics when the value is None and is not considered good practice, so we opted for unwrap_or_else with a default value of “1796-04-02”.
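The two-step conversion works for any optional value that can be displayed, not just dates. Here is a minimal sketch using only the standard library. The helper to_string_or_default is a hypothetical name, and an integer stands in for NaiveDate so that the example does not need the chrono crate.

```rust
use std::fmt::Display;

/// Formats an optional value, falling back to `default` when it is None.
/// `to_string_or_default` is a hypothetical helper used only for this sketch.
fn to_string_or_default<T: Display>(value: Option<T>, default: &str) -> String {
    value
        .map(|inner| inner.to_string()) // Option<T> -> Option<String>
        .unwrap_or_else(|| default.to_string()) // Option<String> -> String
}
```

Unlike unwrap, this never panics: a None simply turns into the default string.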

Fun Fact: 1796-04-02 is the date when Napoleon invaded Italy 🙂

Note: We want to highlight that online examples were not very helpful here. Most of the examples on the internet simply use unwrap, which is not the best practice.

Writing Tests

We have written a few tests for the above read function. They do the following:
a) The function checks whether the csv file is empty. If it finds any records in the csv file, it reports an error that states, “csv data is expected to be empty, but found records in {} rows”.

b) The function checks the column count, comparing the expected number of columns with the actual number in the csv file.

c) The function checks for an invalid csv file and reports an error if it is unable to read it.

d) The function checks for special characters in the records and, if any are found, reports an error with the message “Special character found in csv data at row {}, column {}”.
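The gist has the actual tests. As a rough sketch of what checks a), b) and d) could look like on a Vec<Vec<String>>, here are three helpers written with the standard library only. Several things here are our assumptions and not taken from the gist: the function names, the wording of the column count message, 1-based row and column numbers, and the definition of a special character (anything other than letters, digits, whitespace, '-' and '_').

```rust
/// a) Expects no records at all.
fn check_empty(csv_data: &[Vec<String>]) -> Result<(), String> {
    if csv_data.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "csv data is expected to be empty, but found records in {} rows",
            csv_data.len()
        ))
    }
}

/// b) Expects every row to have `expected` columns.
/// The error message wording is an assumption made for this sketch.
fn check_column_count(csv_data: &[Vec<String>], expected: usize) -> Result<(), String> {
    for (row_index, row) in csv_data.iter().enumerate() {
        if row.len() != expected {
            return Err(format!(
                "expected {} columns but found {} in row {}",
                expected,
                row.len(),
                row_index + 1
            ));
        }
    }
    Ok(())
}

/// d) Reports the first field that contains a special character.
/// Assumption: letters, digits, whitespace, '-' and '_' are allowed.
fn check_special_characters(csv_data: &[Vec<String>]) -> Result<(), String> {
    for (row_index, row) in csv_data.iter().enumerate() {
        for (col_index, field) in row.iter().enumerate() {
            let has_special = field
                .chars()
                .any(|c| !(c.is_alphanumeric() || c.is_whitespace() || c == '-' || c == '_'));
            if has_special {
                return Err(format!(
                    "Special character found in csv data at row {}, column {}",
                    row_index + 1,
                    col_index + 1
                ));
            }
        }
    }
    Ok(())
}
```

Because each helper returns a Result, a test can assert on is_ok() or compare the error message directly, without any unwrap in the code under test.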

All the above mentioned tests are present in the gist.

Conclusion

One of the problems we face as Rust beginners is that online examples are written for more advanced users. For example, most online resources use unwrap() as part of their code snippets assuming that the reader is good enough to know not to use unwrap() in live code. In this post, we have tried to write decent (and explicit) code to help Rust beginners read from and write to csv files. We hope you found this useful.

