【Who We Are?】
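Hytech is a young, energetic team focused on driving enterprise technology transformation in the fintech industry, and a leading global management and technology consulting firm. Innovative thinking and a flat management structure let team members work comfortably in an open, transparent way while delivering outstanding business value to clients worldwide.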
【Why Join The Team?】
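At Hytech, our core technologies keep pace with the times as we work together. We discuss things in real time, maintain good communication channels, and keep management flat, so any question or opinion can be raised and resolved collaboratively. You will work closely with colleagues on international teams. Our engineers do not work shifts, and we have no toxic culture of chronic overtime.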
About the role:
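We are looking for a DevOps engineer with experience planning and operating monitoring platforms, who can help the team build highly available, observable, and automated cloud and application service environments. You will work closely with the development, security, and infrastructure teams to keep systems stable, resilient, and secure in a fast-growing environment. You will also help plan and operate the monitoring and alerting platforms, improving system visibility and failure prevention. This is a key role that combines technical expertise with cross-team collaboration. If you are passionate about improving system reliability and building high-performance, scalable platform architecture, we sincerely invite you to join us!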
What this job involves:
1. Build and maintain CI/CD pipelines using tools such as Jenkins and Bitbucket.
2. Design, deploy, and optimize monitoring systems (e.g., Prometheus, Grafana, CloudWatch, ELK/Loki).
3. Implement infrastructure and application-level monitoring and alerting rules so that anomalies are detected and reported in real time.
4. Analyze monitoring data to identify process and system bottlenecks and propose optimizations.
5. Maintain centralized log collection systems to ensure traceability and meet audit requirements (e.g., PCI DSS).
6. Perform automated deployments using Infrastructure-as-Code (IaC) tools such as Terraform and Ansible.
7. Take part in cloud resource management and cost optimization across AWS and Alibaba Cloud.
8. Collaborate with the team on incident response and root cause analysis (RCA).
About
Want to build a worldwide brand from Taiwan, and to communicate our brand story to millions of users worldwide?
Want to be based in Taiwan but work in a Silicon Valley-like environment, and to build a world-class brand and products?
Want to participate in the global fintech and blockchain movement, and work in an English-speaking workplace?
Come change the world with us! Join this fast-growing startup founded by software veterans and funded by top VCs, Skype co-founders, and the Taiwanese government (NDF)!
We’re hiring an experienced Senior SRE Engineer. The exact mix of skills does not matter, so long as your tool chest covers a range of abilities. Be willing to attack anything that comes your way, learn on the fly, and get things done. Come talk to us if you want to push your skillset in a dynamic, fast-paced environment.
Responsibilities
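1. Handle day-to-day operations of the production platform on AWS and keep systems running stably 24/7, covering system monitoring, application monitoring, log monitoring, component upgrades, security incident response, cost control, and resource management and allocation
2. Handle, troubleshoot, track, and report day-to-day issues and emergencies on the AWS production platform
3. Analyze system bottlenecks and optimize architecture and performance
4. Build and maintain the Zabbix, Nagios, and ELK monitoring systems, handle and tune alerts, and write custom scripts for specific requirements to track system operating status
5. Assist with high-availability deployment, backup, and troubleshooting of application systems and databases
6. Support the architecture needs of the backend, product, and build servers
7. Produce regular reports and keep incident records
8. Help resolve internal IT issues and set up and maintain the internal IT environment
9. Take part in on-call duty as scheduled by the company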
Requirements
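1. 5 or more years of experience using, administering, and operating Linux systems
2. Experience with 24/7 operations is a plus
3. Familiarity with the AWS cloud platform, such as: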
AWS EC2
AWS AppSync
AWS API Gateway
AWS Networking (firewalls and routing)
AWS VPC permissions and routing
AWS Lambda functions
AWS Aurora (cloud-based MySQL)
AWS ElastiCache (specifically Redis)
AWS CloudFront
AWS CloudWatch
AWS Security and protection systems
AWS EKS
AWS IAM
AWS Parameter Store and Secrets Manager
Amazon Simple Notification Service
Dashboard systems such as Grafana
Scripting and programming languages such as Bash, Python, and Golang
Container systems such as Kubernetes
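4. Familiarity with automated configuration management tools: Terraform, Helm, Kustomize
5. Familiarity with system administration, network management, monitoring, issue tracking, and troubleshooting in Linux environments
6. Familiarity with CI/CD pipelines such as Jenkins, GitHub Actions, Argo Workflows, and Argo CD
7. Familiarity with Airflow, including setting it up and writing DAGs
8. Familiarity with large-scale website architecture, and with deploying, backing up, restoring, tuning, and optimizing EKS, MongoDB, Kafka, and related applications, including web servers, databases, traffic management, load balancing, message queues, and high-availability solutions
9. Relevant information security knowledge
10. Good communication and teamwork skills, strong resilience under pressure, a strong ability to learn, and the ability to find and solve problems independently and efficiently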
Location: Taipei
https://goo.gl/maps/vC7WxAurcZVWwCCNA
About XREX
https://www.xrex.io/
Culture
https://downloads.xrex.io/culture
We will process your application first if you apply online:
https://xrex.breezy.hr/p/16372b5314e8