7 Tips for Data Wrangling

If you are new to data wrangling for film productions, or are looking for tips on the activities involved, you are in the right place. In this article we point out some important responsibilities of data wranglers in ensuring the safe handling of camera material.

Here are our 7 tips for data wrangling:

1. Prepare Your Process

Sets are busy, and dealing with multiple camera cards can easily get stressful and confusing. For this reason you should always stick to a defined process, even in hectic situations. This process can, for example, involve color labeling for cards that have not yet been copied, combined with a “right pocket – left pocket” policy that keeps already copied cards separate from cards that have not yet been copied. This helps you stay on top of the backup of camera cards. If you structure your general process, fewer things go wrong.

2. Always Use a Secure Backup Method (Data Integrity)

Checking that the copy is identical to the source is one of the most crucial points of the backup process on set. The consistency of digital files can be verified via checksums: a checksum is computed for the source file and for each copy, and matching values indicate that the data is identical. Make sure to use tools that leverage checksums to ensure copy consistency. You can also do that check manually, but we recommend using software such as Pomfort Silverstack for a secure backup method and data integrity.
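As a rough illustration of what a checksum-verified copy involves, the following Python sketch copies a single file and then compares the checksums of source and destination. The function names `file_checksum` and `copy_and_verify`, the 1 MiB chunk size, and the choice of SHA-256 are assumptions made for this example; they do not describe how Silverstack or any other tool works internally.

```python
import hashlib
import shutil
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy one file, then re-read both sides and compare their checksums."""
    source, destination = Path(source), Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copies content and file timestamps
    source_sum = file_checksum(source)
    destination_sum = file_checksum(destination)
    if source_sum != destination_sum:
        raise IOError(f"Checksum mismatch for {source.name}")
    return destination_sum
```

One limitation of this sketch: right after a copy, the operating system may serve the destination read from its cache rather than from the drive itself, so a production-grade check needs to take care that the data is really read back from the backup medium.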

3. Guarantee Data Completeness with Multiple Instances (Data Redundancy)

Creating at least two backups of a camera card is the baseline here. As the card is probably going to be formatted at some point, two (better three) identical copies of the camera card are the minimum requirement for guaranteeing completeness at any time. Extending the idea of identical data, multiple instances make sure that even if one instance becomes corrupted, there is still another one left to use.
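The redundancy rule can be turned into a simple check before a card is released for formatting. The sketch below is only one possible approach: it compares every file on the card against each backup folder and reports whether enough complete, identical copies exist. The names `tree_checksums` and `safe_to_format`, the folder layout, and the use of SHA-256 are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def tree_checksums(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    sums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            sums[path.relative_to(root).as_posix()] = digest.hexdigest()
    return sums


def safe_to_format(card_root, backup_roots, minimum_copies=2):
    """Return True only if enough backups match the card file for file."""
    card = tree_checksums(card_root)
    if not card:
        # An empty listing more likely means the card is not mounted.
        return False
    matching = sum(1 for backup in backup_roots if tree_checksums(backup) == card)
    return matching >= minimum_copies
```

Called as `safe_to_format("/Volumes/A001", ["/Volumes/RAID/A001", "/Volumes/Shuttle/A001"])` (hypothetical paths), it returns `True` only when both backups contain exactly the same files with the same content as the card.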

4. Data Rate: Ensure Reasonable I/O Speeds to Handle the Data in Time

To copy all material quickly, you have to make sure that the I/O speeds of your drives fit the type and quantity of the shot material. For example, shooting RAW with three cameras requires a quite different setup than shooting a compressed format with only one camera. As a rule of thumb, you should at least be able to copy as fast as new material is created. Calculating data rates and copy times in advance helps you choose or recommend the right hardware for the shoot, so you can be sure to have everything you need for a fast copy on set.

The limiting factor doesn’t always have to be the storage hardware: slow interfaces or the “MD5” checksum type can cap the copy speed even with fast drives. Try to rule out such additional factors in advance by using suitable interfaces for attaching drives and selecting non-limiting checksum methods like xxHash.
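The advance calculation mentioned above can be sketched in a few lines of Python. All numbers in the example are made up for illustration and are not measurements of any real camera, drive, interface, or checksum algorithm; the function names are likewise hypothetical. The sketch assumes 1 GB = 1000 MB and treats the copy chain as being only as fast as its slowest stage.

```python
def recording_rate_mb_s(cameras, bitrate_mbit_s, duty_cycle=1.0):
    """Average data rate produced on set, in megabytes per second.

    bitrate_mbit_s is the per-camera bitrate in megabits per second;
    duty_cycle is the fraction of time the cameras are actually rolling.
    """
    return cameras * bitrate_mbit_s / 8 * duty_cycle


def effective_copy_speed_mb_s(*stage_speeds_mb_s):
    """The slowest stage (reader, interface, checksum, drive) sets the pace."""
    return min(stage_speeds_mb_s)


def copy_time_minutes(card_size_gb, copy_speed_mb_s):
    """Time to copy one full card, assuming 1 GB = 1000 MB."""
    return card_size_gb * 1000 / copy_speed_mb_s / 60


# Hypothetical shoot: 3 cameras at 2000 Mbit/s each, rolling 30% of the day.
produced = recording_rate_mb_s(cameras=3, bitrate_mbit_s=2000, duty_cycle=0.3)

# Hypothetical copy chain in MB/s: card reader, interface, checksum, drive.
copy_speed = effective_copy_speed_mb_s(400, 1000, 500, 600)

print(f"Material is created at about {produced:.0f} MB/s on average")
print(f"Copy chain sustains about {copy_speed:.0f} MB/s")
print(f"A 512 GB card takes about {copy_time_minutes(512, copy_speed):.0f} minutes")
print("Setup keeps up" if copy_speed >= produced else "Setup falls behind")
```

Listing each stage separately makes it easy to see which component to upgrade: in this made-up setup, a faster destination drive would change nothing because the card reader is the bottleneck.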

The last three tips go in a more advanced direction, so make sure to take care of the first four before anything else. Nonetheless, the last three help you do your job with excellence:

5. Use Source Verification to Assure Unaltered Source Data

To find out whether the camera card is intact and can still record and provide camera data reliably, one solution is to read the files on the card twice within one job. Comparing the resulting hashes gives you confidence that the card returns the same data on every read, so copies can be performed deterministically. This process is often referred to as “Source Verification” and adds another layer of security on the side of the camera medium.
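The idea of reading the source twice can be illustrated with a short Python sketch. It hashes every file on the card two times and lists the files whose hashes differ between reads. The names `read_hash` and `verify_source` and the use of SHA-256 are assumptions for this example, not a description of any particular product's implementation.

```python
import hashlib
from pathlib import Path


def read_hash(path, chunk_size=1024 * 1024):
    """Read a file from start to end and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_source(card_root):
    """Read every file on the card twice; return files whose hashes differ."""
    unstable = []
    for path in sorted(Path(card_root).rglob("*")):
        if path.is_file():
            first = read_hash(path)
            second = read_hash(path)
            if first != second:
                unstable.append(path)
    return unstable
```

An empty result means both reads agreed for every file. Be aware that in a simple script like this the second read may be answered from the operating system's cache instead of the card, so a dependable source verification has to make sure the data really comes from the medium both times.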

6. Extract and Acquire Relevant Metadata

Metadata about the recorded clips is crucial for the whole post-production process. Getting basic metadata from the clips, like timecode and running time, sets a good foundation. Getting more detailed technical, camera-related metadata such as exposure settings, and extending it with information from set, e.g. comments or slate data, can be even more beneficial further along in the production.

7. Provide Detailed Reports

By providing the right people with the right metadata you can heavily support your production. Creating detailed reports for the different crafts on set and for post helps them do their jobs better. This information can be targeted in detail to what the different people are looking for: directors are interested in different information than the production department. Furthermore, reports can also serve as proof of the work you have done.
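Targeting a report to its audience can be as simple as choosing which metadata columns to include. The following Python sketch builds a CSV report from a list of clip records. The clip data, the field names, and the two audiences are entirely hypothetical and only show the principle.

```python
import csv
import io

# Hypothetical clip records; real metadata would come from the camera files.
CLIPS = [
    {"name": "A001C001", "timecode": "10:21:04:12", "duration_s": 42,
     "iso": 800, "checksum_ok": True, "comment": "Good take"},
    {"name": "A001C002", "timecode": "10:25:51:03", "duration_s": 67,
     "iso": 800, "checksum_ok": True, "comment": "Boom in shot"},
]

# Which columns each (hypothetical) audience wants to see.
REPORT_COLUMNS = {
    "post": ["name", "timecode", "duration_s", "iso", "checksum_ok"],
    "production": ["name", "duration_s", "comment"],
}


def build_report(clips, audience):
    """Return a CSV string containing only the columns for one audience."""
    columns = REPORT_COLUMNS[audience]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(clips)  # fields not listed in columns are skipped
    return buffer.getvalue()


print(build_report(CLIPS, "production"))
```

Keeping one complete set of clip records and deriving each report from it means every recipient sees consistent data, just filtered to what they need.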

The good news is: you are not left alone in implementing all of the above tips! There is professional software that supports you with many of your tasks. Our data management application Silverstack, for example, copies with checksums, guarantees data integrity, helps you ensure data redundancy, and creates a library with detailed metadata, just to mention a few things.


About the Author
Samuel is a product manager for Pomfort's on-set applications. Usually you can catch him working on new specs for the software products, writing documentation and shooting videos for our products – and sometimes writing blog posts about workflows and equipment.