Project-based payment to develop code using Pig Latin or IPython
Negotiable

I have a CSV file with two columns and a large number of records. One column contains long unstructured text, and I need this column parsed to extract three pieces of data. Two of them can be extracted using regex. For the third, we still need to discuss whether regex can be applied.
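
To give a rough idea of the parsing part, here is a minimal Python sketch. Everything specific in it is an assumption for illustration only: the text is taken to be in the second column, and the two patterns (a date and a dollar amount) are placeholders, since the actual fields will be shared once we talk.

```python
import csv
import io
import re

# Placeholder patterns: the real fields to extract are not given in this post.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
AMOUNT_RE = re.compile(r"\$\s?(\d+(?:\.\d{2})?)")


def extract_fields(text):
    """Return (date, amount) found in the text, or None for a missing field."""
    date = DATE_RE.search(text)
    amount = AMOUNT_RE.search(text)
    return (
        date.group(1) if date else None,
        amount.group(1) if amount else None,
    )


def parse_rows(handle):
    """Yield (record_id, date, amount) for each two-column CSV row."""
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip malformed rows
        record_id, text = row[0], row[1]  # assumption: text is the second column
        yield (record_id,) + extract_fields(text)


# Made-up sample rows standing in for the real file.
sample = io.StringIO(
    '1,"Paid $25.00 on 2015-03-01, thanks"\n'
    '2,"no usable data here"\n'
)
results = list(parse_rows(sample))
```

The `csv` module is used instead of splitting on commas because the unstructured text may itself contain commas and quotes.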

The second requirement is that it be implemented on a cluster using MapReduce functions, so that the records are processed quickly.
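
For the MapReduce part, something along these lines is what I have in mind. This is only a sketch written in the style of Hadoop Streaming, where the mapper and reducer exchange tab-separated key/value lines. The year pattern, the counting step and the sample records are hypothetical, and the map, sort and reduce phases are simulated locally here rather than run on a cluster.

```python
import re
from itertools import groupby

# Placeholder pattern: pulls the year out of a YYYY-MM-DD date in the text.
YEAR_RE = re.compile(r"\b(\d{4})-\d{2}-\d{2}\b")


def mapper(lines):
    """Map step: emit 'year<TAB>1' for every record that contains a date."""
    for line in lines:
        match = YEAR_RE.search(line)
        if match:
            yield f"{match.group(1)}\t1"


def reducer(sorted_lines):
    """Reduce step: sum the counts per key; input must be sorted by key."""
    pairs = (line.split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(int(value) for _, value in group)


# Made-up records; on a cluster the framework does the sort/shuffle between steps.
records = [
    "1,paid on 2015-03-01",
    "2,refund on 2015-04-11",
    "3,no date in this one",
    "4,order placed 2014-12-30",
]
counts = dict(reducer(sorted(mapper(records))))
```

If only the extracted fields are needed per record, with no aggregation, the job could be map-only and the reducer dropped.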

Please PM me if you're interested. Let's talk.