
Suraj Mulla

Big Data and Hadoop Developer

  • Freelancer in 411052 Pune
  • Graduation: Computer Science
  • Hourly-/Daily rates:
  • Languages: English (Limited professional) | Hindi (Native or Bilingual)
  • Last update: 09.07.2018
I have more than 7 years of experience in Big Data, Python and Hadoop. I have worked on a variety of projects where I was able to apply and deepen these skills, and with this expertise I am now able to work independently for clients and grow my freelance business.

Technology Stack:
Big Data Frameworks - Hive, Pig, Spark, Hadoop, MapReduce, Flink, NiFi, Storm
Programming Languages - Python, Scala, Java
Amazon Web Services - EC2, EMR, CloudWatch, Lambda, RDS, ELB, Route 53, Amazon S3 and Redshift.
Along with HBase, Cassandra, TDCH, Sqoop, OraOop, Oozie, Azkaban, Airflow, Flume and Kafka.

I am very keen on new technologies and on the work I offer my clients.
A trustworthy developer with strong interpersonal and project-management skills.
Thank you.
Project: German town cluster stabilization
Technologies: HBase, Java, Scala, YCSB, Phoenix and JMeter.
Description: The project aimed to stabilize the HBase cluster by reducing the data volume in the system and distributing the load evenly across the cluster.
Work Details: Integrated JConsole with HBase and Grafana. Implemented a row key in Scala that distributes incoming data symmetrically across the table's regions. Ran YCSB tests to benchmark the table.
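The core of the row-key work above is salting: prefixing each natural key with a stable hash-based bucket so that sequential keys spread across all regions instead of hot-spotting one. A minimal Python sketch of that idea (the project used Scala; the region count and key format here are illustrative):

```python
import hashlib

NUM_REGIONS = 16  # assumed number of pre-split regions

def salted_row_key(natural_key: str) -> str:
    """Prefix the natural key with a deterministic hash-based salt so that
    monotonically increasing keys spread evenly across all regions."""
    salt = int(hashlib.md5(natural_key.encode()).hexdigest(), 16) % NUM_REGIONS
    return f"{salt:02d}|{natural_key}"

# Sequential keys land in different salt buckets:
keys = [salted_row_key(f"user-{i}") for i in range(100)]
buckets = {k.split("|", 1)[0] for k in keys}
```

Because the salt is derived from the key itself (not random), point reads can recompute it; only full scans need to fan out across all salt prefixes.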

Project: Google Trends
Technologies: Google Sheets, Amazon S3, Python and AWS Lambda.
Description: The project aimed to fetch the statistics that Google Trends provides and process them.
Work Details: Created Python scripts to pull the data from Google Sheets, process it locally and load it to S3. Helped build a script triggered from AWS Lambda; its response was stored in an S3 bucket using a Python script.
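A hedged sketch of the Lambda-to-S3 step described above, assuming the sheet rows arrive in the event payload; the bucket name, event shape, and key layout are illustrative, and boto3 is available in the Lambda runtime:

```python
import json
from datetime import datetime, timezone

def object_key(trend: str, ts: datetime) -> str:
    """Build a date-partitioned S3 key, e.g. trends/2018/07/09/python.json."""
    return f"trends/{ts:%Y/%m/%d}/{trend}.json"

def handler(event, context):
    """AWS Lambda entry point (sketch). Expects the Google Sheets rows in
    event['records'] and writes each row as a JSON object to S3."""
    import boto3  # provided by the Lambda runtime
    s3 = boto3.client("s3")
    now = datetime.now(timezone.utc)
    records = event.get("records", [])
    for rec in records:
        s3.put_object(
            Bucket="example-trends-bucket",  # assumed bucket name
            Key=object_key(rec["trend"], now),
            Body=json.dumps(rec).encode("utf-8"),
        )
    return {"written": len(records)}
```

Date-partitioned keys like these keep later batch reads (e.g. from Redshift or EMR) cheap, since a day's data can be listed with a single prefix.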

Project: Legal Analytics Platform
The product, mLeAP, uses NLP and machine-learning models to enable lawyers to analyze cases in a matter of seconds. Case analysis is based on the factual data known to them, and the results can be consumed directly in the case presentation.
Our target is to reduce legal research time by at least 60%; research time is considered the biggest contributor to the 35 million cases pending in various Indian courts.
Link :

Project: In-House Data Analytics Platform/Portal, with the goal of giving users a near-real-time data analytics experience.
1. Sampling of data to reduce unwanted data flowing into the system; the techniques implemented were weighted reservoir sampling in a distributed architecture and hash-based sampling.
2. Hive UDAFs for various requirements, including logic for probabilistic aggregation over a streaming data set.
3. Development of the UI and backend, including searching and reporting capabilities.
4. Test planning, test cases and test-scenario reviews.
5. Implementation of Agile best practices in the team.
6. TDD and code reviews.
7. Feature improvements and idea suggestions.
Platform: Greenplum, Hadoop, Hive, Pig, MapReduce
Language: Java, Python
Framework: Django
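The weighted reservoir sampling mentioned in point 1 can be sketched with the A-Res scheme (Efraimidis-Spirakis): each item gets the key u**(1/w) for u drawn uniformly from (0,1), and the k largest keys are kept. A minimal single-pass Python sketch, with an illustrative stream:

```python
import heapq
import random

def weighted_reservoir_sample(stream, k):
    """A-Res weighted reservoir sampling: keep the k items with the largest
    key u**(1/w), u ~ Uniform(0,1). One pass over the stream, so in a
    distributed setting each worker can sample locally and the partial
    reservoirs can be merged by the same keep-largest-keys rule."""
    heap = []  # min-heap of (key, item); the smallest key is evicted first
    for item, weight in stream:
        if weight <= 0:
            continue  # non-positive weights can never be sampled
        key = random.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, item))
    return [item for _, item in heap]

# Illustrative stream of (item, weight) pairs:
sample = weighted_reservoir_sample(((f"ev{i}", i % 5 + 1) for i in range(1000)), 10)
```

The mergeability noted in the docstring is what makes the technique fit a distributed architecture: partial reservoirs keep their keys, and a reducer just takes the global top-k.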

Project: Apache NiFi/Spark
We are currently working on a data-warehousing project for a security and law firm in London, covering security-log generation and analysis.
We use Apache NiFi for data-stream ingestion and process the stream with Apache Spark; lookups are done against a file and HBase. We have developed a log-monitoring dashboard using Kibana.
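The file-based lookup step in that pipeline amounts to enriching each parsed log record from a small reference map (which Spark would broadcast to executors, falling back to HBase for keys not in the map). A plain-Python sketch of just that enrichment, with an invented log format and mapping:

```python
# Assumed log line format: "<timestamp> <ip> <event>", and a small
# IP -> site mapping standing in for the broadcast file-based lookup.
ip_to_site = {"203.0.113.7": "london-dc", "198.51.100.9": "leeds-dc"}

def enrich(line: str) -> dict:
    """Parse one log line and attach the site resolved via the lookup map."""
    ts, ip, event = line.split(" ", 2)
    return {
        "timestamp": ts,
        "ip": ip,
        "event": event,
        # unknown IPs would fall through to an HBase get in the real pipeline
        "site": ip_to_site.get(ip, "unknown"),
    }

records = [enrich("2018-07-09T10:00:00 203.0.113.7 LOGIN_FAILED")]
```

In Spark this function would run inside a map over the NiFi-fed stream; keeping the hot lookup data broadcast-local and reserving HBase for misses avoids a remote call per record.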

We are a group of freelancers who work around 14 hours a day.
We are available for 80 hours a week.
For now we are looking for remote jobs, so travel availability is minimal.