Web Scraping Specialist

mlabs · Remote · New Jersey, United States
Full-time · $75K/yr - $150K/yr

Qualifications

Requirements:

  • Extraction Expertise: Proven experience in extracting data from complex websites with minimal supervision, supported by a portfolio of relevant projects.
  • Technical Proficiency: Advanced knowledge of Python or JavaScript, particularly with libraries and frameworks such as BeautifulSoup, Scrapy, or Selenium.
  • Advanced Programming: Strong understanding of asynchronous programming, multithreading, and distributed scraping architectures.
  • Web Fundamentals: In-depth expertise in HTML, CSS, JavaScript, and the Document Object Model (DOM).
  • Data Storage: Familiarity with NoSQL databases (e.g., MongoDB, Cassandra), including the capability to devise efficient storage solutions.
  • Cloud Infrastructure: Experience in deploying and managing large-scale scraping jobs utilizing cloud services such as AWS, Google Cloud, or Azure.

Preferred Skills: Capability to apply machine learning algorithms for data cleaning, categorization, or predictive analysis; active engagement in relevant communities or initiatives.

About the job

mlabs is hiring a Web Scraping Specialist to support large-scale data extraction for AI model training. This full-time position is fully remote, but requires at least six hours of workday overlap with the Eastern Standard Time (EST) zone. The team manages distributed crawlers and complex pipelines that process billions of data points, including video, transcripts, and audio.

Compensation

Annual salary ranges from $75,000 to $150,000, depending on experience.

What You Will Do

  • Develop and optimize code: Build, test, and refine high-performance scraping solutions for a wide range of online sources. Focus on reliability and efficiency.
  • Oversee data retrieval: Manage complex extraction tasks, including handling pagination and dynamic content such as AJAX-loaded pages.
  • Ensure data quality: Clean and format collected data to meet strict standards for downstream analysis and processing.
  • Database management: Organize and store scraped data in appropriate databases, with attention to speed and long-term integrity.
  • Monitor and maintain systems: Continuously track scraping operations and infrastructure to quickly identify and resolve issues, maintaining steady data flow.

Work Environment

This role suits someone who enjoys technical challenges and prefers working without heavy bureaucracy. Expect a hands-on, collaborative setting focused on delivering results.

About mlabs

mlabs is a forward-thinking organization specializing in the development of AI technologies. With a commitment to innovation and quality, they manage a robust infrastructure that processes extensive web data, playing a crucial role in the advancement of machine learning and artificial intelligence.
