Responsibilities:
We are looking for technical experts in two areas:
1. Cloud service provisioning: operate and manage application systems on mainstream cloud providers such as AWS, GCP, Azure, and Aliyun.
2. Release management: compile, assemble, and deliver source code into container images, then deploy them to infrastructure in various formats.
You will be responsible for:
1. Deploy, manage, and operate scalable, highly available, and fault-tolerant systems.
2. Migrate existing on-premises applications to the cloud.
3. Select the appropriate cloud service based on compute, data, or security requirements.
4. Estimate cloud usage costs and identify operational cost control mechanisms.
5. Execute test scripts to build software packages, and ensure that new products are configured and coded properly for successful integration and operation.
6. Build test environments and troubleshoot any issues with software performance. Work with software engineers (R&D) to resolve issues and document fixes for future reference.
7. Build tools to support the software engineering process, review engineering practices, assist in researching new technologies, and meet with the development team to discuss future needs. Provide ongoing support for completed products and maintain servers.
8. Handle incident escalations and provide on-call support when necessary.
Requirements:
1. 3+ years of experience provisioning, operating, and managing cloud environments.
2. Mastery of CI/CD tools and methodologies.
3. Experience with Kubernetes, Docker (including Docker Compose), and containerization.
4. Experience with configuration management and infrastructure as code.
5. Experience in Site Reliability Engineering or DevOps is a plus.
6. Excellent communication skills, including across departments.
7. Willingness to work in a high-growth, scaling technical environment.
8. Candidates with less experience or exposure will be considered for the SysOps Engineer position.
Tools: Terraform, Argo CD, Jenkins, shell
【Who We Are?】
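Hytech is a young, energetic team focused on driving enterprise technology transformation in the fintech industry, and a leading global management and technology consulting firm. Innovative thinking and a flat management structure let team members work comfortably in an open, transparent way, while delivering outstanding business value to clients worldwide.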
【Why Join The Team?】
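At Hytech, our core technologies keep pace with the times as we work together. We discuss issues in real time through good communication channels, and our flat management structure means any question or opinion can be raised and resolved collaboratively. You will work closely with colleagues on multinational teams. Our engineers do not work shifts, and we have no toxic culture of chronic overtime.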
About the role:
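We are looking for a DevOps engineer with experience planning and operating monitoring platforms, who can help the team build highly available, observable, and automated cloud and application service environments. You will work closely with the development, information security, and infrastructure teams to ensure that systems remain stable, resilient, and secure in a fast-growing environment. You will also help plan and operate the monitoring and alerting platform, improving system visibility and failure prevention. This is a key role that combines technical expertise with cross-team collaboration. If you are passionate about improving system reliability and building high-performance, scalable platform architecture, we sincerely invite you to join us!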
What this job involves:
1. Build and maintain CI/CD pipelines using tools such as Jenkins and Bitbucket.
2. Design, deploy, and optimize monitoring systems (e.g., Prometheus, Grafana, CloudWatch, ELK/Loki).
3. Implement infrastructure and application-level monitoring and alerting rules so that anomalies are detected and reported in real time.
4. Analyze monitoring data to identify process and system bottlenecks and propose optimizations.
5. Maintain centralized log collection systems to ensure traceability and compliance with audit requirements (e.g., PCI DSS).
6. Perform automated deployments using Infrastructure-as-Code (IaC) tools such as Terraform and Ansible.
7. Manage cloud resources and support cost optimization across AWS and Alibaba Cloud.
8. Collaborate with the team on incident response and root cause analysis (RCA).