1.System Administration:
●Install, configure, and maintain open-source operating systems (e.g., Linux distributions like Ubuntu, CentOS).
●Perform system updates, patching, and security hardening.
●Monitor system performance and troubleshoot issues.
●Implement and maintain system backups and disaster recovery plans.
2.Database Management:
●Install, configure, and manage open-source database management systems (e.g., PostgreSQL, MySQL, MariaDB).
●Perform database backups, restoration, and replication.
●Optimize database performance and troubleshoot issues.
●Ensure database security and access control.
3.Containerization and Orchestration:
●Manage and maintain containerized applications using Docker.
●Deploy, manage, and scale applications using Kubernetes (K8s).
●Configure and troubleshoot container networking and storage.
●Monitor container and cluster health and performance.
4.Automation and Scripting:
●Develop and maintain scripts (e.g., Bash, Python) to automate system administration tasks.
●Implement infrastructure-as-code (IaC) principles using tools like Ansible, Terraform, or similar.
5.Security:
●Implement and maintain security best practices for all managed systems.
●Regularly audit systems for vulnerabilities and apply necessary patches.
●Configure and manage firewalls, intrusion detection/prevention systems, and other security tools.
6.Collaboration and Documentation:
●Work closely with development and operations teams to support application deployments and infrastructure needs.
●Create and maintain detailed documentation for system configurations, processes, and procedures.
7.Troubleshooting and Problem Solving:
●Identify, diagnose, and resolve system and platform issues in a timely manner.
●Participate in on-call rotation.
We are looking for an enthusiastic and motivated Site Reliability Engineer (SRE) to join our growing team. In this role, you will have the opportunity to learn and contribute to the stability, performance, and scalability of our critical systems. We place a strong emphasis on security in all aspects of our operations. You will work closely with teams to maintain and improve our infrastructure, monitor services, and respond to incidents. This is an excellent opportunity to develop your skills in a dynamic and supportive environment.
<Responsibilities>
1.Assist in Maintaining and Optimizing Infrastructure: Support teams in the day-to-day maintenance and optimization of our infrastructure components.
2.Monitor Services and Address Issues: Monitor system health and service performance, and assist in troubleshooting and resolving issues in a timely manner.
3.Track Resource Usage and System Status: Help monitor various resource indicators and the overall status of the system, contributing to optimization efforts.
4.Support System Stability and Incident Response: Assist in maintaining system stability and participate in incident response procedures under guidance.
5.Contribute to Preventing System Failures: Work with the team to implement measures that help avoid system failures and service interruptions.
6.Collaborate with Other Teams: Work alongside other teams to continuously learn about and contribute to improving system architecture and service quality.
7.Support System Maintenance and Deployment Processes: Assist in the execution of established processes for system maintenance, deployment, and upgrades.
8.Learn and Apply SRE Best Practices: Actively learn and apply SRE principles and best practices in daily tasks.
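1.Daily Maintenance and Technical Support:
-Provide first-line handling of IT issues involving hardware, software, server room equipment, and networking, for example: a computer that will not boot, a web page that cannot be reached, or a slow network.
-Carry out day-to-day operations of warehouse systems and network services, and troubleshoot service failures so that employee computers run reliably.
-Help on-site staff debug equipment or services; if the user is not on site, provide assistance by phone or internal messaging software.
-Work with other IT team members to optimize infrastructure and stabilize warehouse operations services.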
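2.Asset Management and Deployment:
-Manage the IT asset inventory to ensure resources are used effectively.
-For example, track computers, monitors, PDAs, and software licenses.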
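3.Other Duties:
-Periodically check and report on IT services, apply server system updates, and write technical or troubleshooting documentation.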
Key Responsibilities:
Manage the System Center (MECM, SCOM, SCVMM...) infrastructure, including design, configuration, maintenance, and optimization, to ensure system performance and stability.
Administer System Center servers, including configuration, maintenance, and upgrades.
Manage the antivirus (Trend Micro Apex One) system infrastructure, including design, configuration, maintenance, and optimization, to ensure system performance and stability.
Administer Apex One Server, including policy configuration, maintenance, and upgrades.
Monitor system performance, promptly resolve system issues, and minimize downtime.
Collaborate with teams to implement new systems or upgrades, including requirements analysis, planning, and execution.
Prepare and update system documentation, including system settings and operational manuals.
Conduct regular system backups and disaster recovery planning.
1. Develop, maintain, and operate EDI and logistics-related systems on a daily basis.
2. Discuss and define processes with external customers and internal users.
3. Implement logistics (Direct Ship) related projects, including system development and maintenance.
4. Provide occasional on-call support for urgent requests from external customers and internal users outside normal business hours.
5. Travel occasionally to provide onsite support worldwide.