Posted: October 18, 2024
Job Responsibilities and Requirements
Job Description: We are seeking an experienced Python Developer to join our team, responsible for data analysis, pipeline construction, and workflow management. The ideal candidate is proficient in Python for data analysis, including libraries such as NumPy, Pandas, and Scikit-learn, and has basic machine learning skills. Familiarity with Airflow for workflow management and knowledge of a cloud platform (GCP or Azure) are also required. Strong English communication skills are essential.
Responsibilities:
• Design, build, and maintain efficient data pipelines to support various data analysis and business needs.
• Perform data extraction, cleaning, transformation, and loading (ETL) operations using Python, with proficiency in libraries such as NumPy for numerical operations, Pandas for data manipulation, and Scikit-learn for basic machine learning tasks.
• Apply basic machine learning techniques to enhance data processing and analysis.
• Manage and schedule workflows using Airflow, ensuring the reliability and scalability of data pipelines.
• Participate in the design and optimization of data models to ensure efficiency and maintainability.
• Collaborate with data scientists and analysts to understand business requirements and provide corresponding data solutions.
• Ensure data integrity, security, and privacy, adhering to relevant data protection regulations.
• Stay updated on the latest data technologies and tools, and propose improvements.
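As a rough illustration of the extract/clean/transform/load duties above, a single ETL step with Pandas and NumPy might look like the following. This is a minimal sketch, not part of the posting; the data, column names, and load target are hypothetical, and a real pipeline would read from and write to actual source and warehouse systems.

```python
import numpy as np
import pandas as pd

# Extract: a real pipeline would read from a source system;
# here we build a small frame inline (hypothetical data).
raw = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "amount": ["10.5", "n/a", "7.25", "3.0"],
})

# Clean: coerce bad values to NaN, drop duplicates and unusable rows.
raw["amount"] = pd.to_numeric(raw["amount"], errors="coerce")
clean = raw.drop_duplicates().dropna(subset=["amount"]).copy()

# Transform: a NumPy-backed derived column, then a per-user aggregate.
clean["log_amount"] = np.log1p(clean["amount"])
summary = clean.groupby("user_id", as_index=False)["amount"].sum()

# Load: in practice this would write to a database or warehouse;
# here we just render CSV text.
csv_out = summary.to_csv(index=False)
```

In a scheduled pipeline, a step like this would typically be wrapped as an Airflow task so that retries, dependencies, and monitoring are handled by the scheduler.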
Requirements:
• Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
• Proficiency in Python programming with strong data analysis and processing skills, including:
◦ NumPy for numerical computations and array manipulations.
◦ Pandas for data manipulation and analysis.
◦ Scikit-learn for implementing basic machine learning models.
◦ Matplotlib and Seaborn for data visualization.
◦ SQLAlchemy for database interactions.
• Familiarity with Apache Airflow, with the ability to design and manage workflows independently.
• Knowledge of Google Cloud Platform (GCP) or Microsoft Azure, with the capability to deploy and manage data pipelines in cloud environments.
• Proficiency in SQL with strong database querying and manipulation skills.
• Excellent problem-solving abilities and team collaboration skills, with the ability to work in a fast-paced environment.
• Strong communication skills in English, with the ability to clearly articulate technical issues and solutions.
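To give a concrete sense of the SQL querying and manipulation skills listed above, here is a minimal sketch of the kind of aggregate query the role involves. The schema and data are hypothetical, and SQLite (via Python's standard library) stands in for a production database.

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    INSERT INTO orders (customer, total) VALUES
        ('acme', 120.0), ('acme', 30.0), ('globex', 75.0);
""")

# Aggregate revenue per customer, largest first -- the kind of
# grouping and ordering the requirement refers to.
rows = conn.execute(
    "SELECT customer, SUM(total) AS revenue "
    "FROM orders GROUP BY customer ORDER BY revenue DESC"
).fetchall()
conn.close()
# rows == [('acme', 150.0), ('globex', 75.0)]
```

In production, the same query would typically run against a managed database or warehouse, often issued through SQLAlchemy rather than a raw driver connection.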
Preferred Qualifications:
• Practical experience with GCP or Azure projects.
• Familiarity with big data technologies (e.g., Hadoop, Spark).
• Experience in data warehouse construction (e.g., BigQuery, Snowflake).
• Understanding of data protection regulations (e.g., GDPR, CCPA).