Introduction to Automating Development Work: A Complete Guide to ETL (Python), Bots (Slack/Discord), CI/CD (GitHub Actions), and Monitoring (Sentry/Datadog)
Introduction
In software development, repetitive manual work not only reduces productivity but also increases the risk of human error. Automating the development flow is therefore key to improving both project stability and speed. This article explains automation in detail across the following nine areas:
- Automating data processing (using ETL scripts)
- Turning routine work into Bots (introducing Slack / Discord Bots)
- Automating infrastructure management (applying Terraform / Pulumi)
- Automating deployment (optimizing CI/CD)
- Automating error handling (using Sentry / Datadog)
- Automating Pull Request labeling and merging
- Automating log monitoring and anomaly detection
- Automating tests (introducing E2E / unit tests)
- Automating warning and alert notifications (using PagerDuty / Opsgenie)
We will also introduce concrete tools and code samples, so feel free to use them as a reference.
Automating Data Processing (Using ETL Scripts)
What is ETL (Extract, Transform, Load)?
ETL refers to the process of Extracting, Transforming, and Loading data. By automating periodic data processing, you can reduce manual workload and ensure data consistency.
Tools used
- Apache Airflow: Ideal for workflow management
- dbt (Data Build Tool): Write data transformation processes concisely
- Pandas (Python library): Convenient for scripting data processing
Sample ETL Code Using Python + Pandas
```python
import pandas as pd

def extract_data(file_path):
    # Extract: read the raw CSV into a DataFrame
    return pd.read_csv(file_path)

def transform_data(df):
    # Transform: stamp the processing time, then drop incomplete rows
    df['processed_at'] = pd.Timestamp.now()
    df = df.dropna()
    return df

def load_data(df, output_path):
    # Load: write the cleaned data back out as CSV
    df.to_csv(output_path, index=False)

# Example
file_path = "data/input.csv"
output_path = "data/output.csv"

df = extract_data(file_path)
df = transform_data(df)
load_data(df, output_path)
```
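The same three-step pipeline can also be sketched with only the standard library when pandas is not available. In this sketch the transform step drops rows with missing fields and stamps each row with the processing time; the function names simply mirror the pandas version above:

```python
import csv
from datetime import datetime

def extract_data(file_path):
    # Extract: read the CSV into a list of dicts
    with open(file_path, newline="") as f:
        return list(csv.DictReader(f))

def transform_data(rows):
    # Transform: drop rows with missing values, then stamp the processing time
    cleaned = [r for r in rows if all(v not in (None, "") for v in r.values())]
    for r in cleaned:
        r["processed_at"] = datetime.now().isoformat()
    return cleaned

def load_data(rows, output_path):
    # Load: write the cleaned rows back out as CSV
    if not rows:
        return
    with open(output_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
```

For periodic execution, a script like this can then be scheduled with cron or wrapped in an Airflow task.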
Turning Routine Work into Bots (Introducing Slack / Discord Bots)
Benefits of Introducing Bots
- Reduce effort by automating routine tasks
- Send instant notifications and reminders
- Automatically respond to user inquiries
Tools used
- Slack API / Discord API: Create chatbots
- AWS Lambda / Google Cloud Functions: Run Bots serverlessly
- Python (discord.py, slack_sdk): Easily develop Bots
Sample Slack Bot Code (Python)
```python
import os

from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError

# The bot token is read from the environment -- never hard-code it
client = WebClient(token=os.getenv("SLACK_BOT_TOKEN"))

def send_message(channel, text):
    try:
        client.chat_postMessage(channel=channel, text=text)
    except SlackApiError as e:
        print(f"Error sending message: {e.response['error']}")

# Example
send_message("#general", "Automated message!")
```
Automating Infrastructure Management (Applying Terraform / Pulumi)
Manual infrastructure configuration is error-prone and difficult to scale. Managing infrastructure as code minimizes the opportunity for such mistakes and makes environments reproducible.
Tools used
- Terraform: Codify infrastructure (Infrastructure as Code)
- Pulumi: Manage infrastructure using programming languages (TypeScript, Python, etc.)
Sample Terraform Code (Creating an AWS EC2 Instance)
```hcl
provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "web" {
  ami           = "ami-12345678"  # replace with a valid AMI ID for your region
  instance_type = "t2.micro"
}
```
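For comparison, a roughly equivalent Pulumi program in Python might look like the following. This is a sketch assuming the `pulumi-aws` package; it runs inside a Pulumi stack via `pulumi up`, not as a standalone script:

```python
import pulumi_aws as aws

# Equivalent of the Terraform resource above: a single t2.micro instance
web = aws.ec2.Instance(
    "web",
    ami="ami-12345678",  # placeholder AMI ID
    instance_type="t2.micro",
)
```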
Automating Deployment (Optimizing CI/CD)
It is important to automatically test and deploy code changes to improve development efficiency and quality.
Tools used
- GitHub Actions / GitLab CI/CD / CircleCI
- Docker (containerization)
Sample GitHub Actions Workflow
```yaml
name: Deploy

on: [push]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'
      - name: Install dependencies
        run: npm install
      - name: Run tests
        run: npm test
      - name: Deploy
        run: npm run deploy
```
Automating Error Handling (Using Sentry / Datadog)
It is important to detect errors that occur during operations in real time and respond quickly.
Tools used
- Sentry: Real-time error logging
- Datadog: Application monitoring and metrics visualization
Introducing Sentry (JavaScript)
```javascript
import * as Sentry from "@sentry/browser";

Sentry.init({
  dsn: "https://your_dsn@sentry.io/your_project_id",
});
```
There is also a blog post on introducing Sentry into a Next.js project, so please refer to it as well.
Automating Pull Request Labeling and Merging
By using GitHub Actions, you can automatically add labels to Pull Requests based on the files they change. Place a workflow like the following in `.github/workflows/`; the `actions/labeler` action it uses reads its label-to-path rules from a separate `.github/labeler.yml` file.
```yaml
name: "PR Labeler"

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  label:
    runs-on: ubuntu-latest
    steps:
      - name: Label PR
        uses: actions/labeler@v4
        with:
          repo-token: "${{ secrets.GITHUB_TOKEN }}"
```
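The workflow above only runs the labeler; the label-to-path mapping itself lives in `.github/labeler.yml`. A minimal example for `actions/labeler@v4` might look like this (the label names and glob patterns here are illustrative, not prescribed):

```yaml
# .github/labeler.yml -- which labels apply to which changed paths
documentation:
  - "docs/**"
frontend:
  - "src/components/**"
ci:
  - ".github/workflows/**"
```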
Setting Up Automatic Merging
Using GitHub Actions, you can also automatically merge PRs that meet specific conditions.
```yaml
name: "Auto Merge"

on:
  pull_request:
    types: [labeled]

jobs:
  auto-merge:
    runs-on: ubuntu-latest
    steps:
      - name: Merge PR
        uses: pascalgn/automerge-action@v0.15.0
        env:
          GITHUB_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
          MERGE_METHOD: "squash"
```
With this configuration, PRs carrying the automerge label (the action's default trigger label) will be merged automatically.
Automating Log Monitoring and Anomaly Detection
It is important to build a mechanism that allows you to respond quickly when the system detects anomalies. Representative tools include:
- AWS CloudWatch (for AWS environments)
- Datadog (feature-rich with extensive dashboards)
- Prometheus (open-source monitoring system)
Example of Anomaly Detection Using AWS CloudWatch
By configuring an AWS CloudWatch alarm as shown below, you can send notifications when an anomaly occurs.
```json
{
  "AlarmName": "High CPU Usage",
  "MetricName": "CPUUtilization",
  "Namespace": "AWS/EC2",
  "Statistic": "Average",
  "ComparisonOperator": "GreaterThanThreshold",
  "Threshold": 80,
  "EvaluationPeriods": 2,
  "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:NotifyMe"]
}
```
This JSON configuration is used to create an AWS CloudWatch alarm. It triggers an alert when a specific metric (in this case, the CPU utilization of an EC2 instance) exceeds a threshold. It is used to detect abnormal CPU usage and prompt the operations team or notification system to respond immediately.
#### Actual Behavior
1. CPU usage of an AWS EC2 instance exceeds 80% (first breach)
2. It exceeds 80% again in the next monitoring period (EvaluationPeriods = 2), so the alarm fires
3. A notification is sent to the SNS (Simple Notification Service) topic arn:aws:sns:us-east-1:123456789012:NotifyMe
4. Via SNS, alerts fan out to email, SMS, Lambda functions, Opsgenie, etc.
5. Operations staff respond, or an automatic remediation script runs
Notes
- Preventing over-detection
- Set EvaluationPeriods appropriately so that temporary CPU spikes do not generate unnecessary alerts
- For example, by setting EvaluationPeriods = 3, an alarm will only be triggered if CPU usage exceeds 80% three times in a row
- Optimizing notifications
- Use SNS to send notifications to appropriate destinations (Slack, Opsgenie, PagerDuty)
- Can be combined with Lambda to implement automatic scale-out processing
- Checking the region
- Confirm that the region (us-east-1) in arn:aws:sns:us-east-1:123456789012:NotifyMe is correct
- It may not work if it is not set to the same region as the monitored resource
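The same alarm can also be created from the AWS CLI with `put-metric-alarm`. The sketch below mirrors the JSON configuration above; the 5-minute period and the SNS topic ARN are assumptions you would adjust for your environment:

```shell
aws cloudwatch put-metric-alarm \
  --alarm-name "High CPU Usage" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:NotifyMe
```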
Automated Testing Using Jest / Playwright
You can automate unit tests with Jest and integrate them into your CI/CD pipeline using GitHub Actions.
```yaml
name: "Run Tests"

on:
  push:
    branches:
      - main

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Install dependencies
        run: npm install
      - name: Run tests
        run: npm test
```
Playwright is effective for E2E tests.
```typescript
import { test, expect } from '@playwright/test';

test('basic test', async ({ page }) => {
  await page.goto('https://example.com');
  await expect(page).toHaveTitle(/Example Domain/);
});
```
There is also a blog post on how to introduce Playwright, so please refer to it as well.
Automating Warning and Alert Notifications (Using PagerDuty / Opsgenie)
Incident Management Using PagerDuty
PagerDuty is an incident management platform that integrates with system monitoring tools and alert management systems. When specific thresholds are exceeded, it automatically sends notifications and enables rapid response.
```json
{
  "routing_key": "YOUR_ROUTING_KEY",
  "event_action": "trigger",
  "payload": {
    "summary": "High CPU Usage Alert",
    "severity": "critical",
    "source": "server-1"
  }
}
```
The JSON above is a sample request for triggering an incident using PagerDuty’s Events API v2. Specifically, it is configured to notify PagerDuty of an incident indicating that CPU usage is abnormally high (High CPU Usage Alert).
After registering an incident in PagerDuty, you manage incidents with the following actions:
- Trigger: Create a new incident
- Acknowledge: Mark the incident as being handled by a responder
- Resolve: Mark the incident as resolved
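As a sketch, an event like the one above can be assembled and sent from Python with only the standard library. The helper names `build_event` and `send_event` are illustrative, not part of any SDK; the endpoint is PagerDuty's Events API v2 enqueue URL:

```python
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

def build_event(routing_key, summary, severity, source, action="trigger"):
    # Assemble an Events API v2 payload like the JSON sample above
    return {
        "routing_key": routing_key,
        "event_action": action,
        "payload": {
            "summary": summary,
            "severity": severity,
            "source": source,
        },
    }

def send_event(event):
    # POST the event as JSON (requires a real routing key to succeed)
    req = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (constructed but not sent here): the alert from the section above
event = build_event("YOUR_ROUTING_KEY", "High CPU Usage Alert", "critical", "server-1")
```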
Integration with Opsgenie
Opsgenie is a tool that streamlines alert management and incident response, and it can be easily integrated with communication tools such as Slack and Microsoft Teams.
By integrating them, you can immediately notify team members when an alert occurs, enabling rapid response.
How Opsgenie Integration Works
In Opsgenie, you can integrate with external tools using Integrations.
You can integrate with tools such as:
- Notification tools: Slack, Microsoft Teams, Email, SMS, etc.
- Monitoring tools: Datadog, Prometheus, AWS CloudWatch, New Relic, etc.
- Ticketing systems: Jira, ServiceNow, Zendesk, etc.
As an example, here is what a Slack integration might look like. Note that Opsgenie integrations are normally created from the Integrations page in the Opsgenie console (or via its API); the YAML below is a simplified illustration of the resulting settings:

```yaml
integration:
  name: "Slack Integration"
  type: "slack"
  enabled: true
```

With an integration like this enabled, alerts raised in Opsgenie are forwarded to Slack.
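Alerts can also be created programmatically through the Opsgenie Alert API (`https://api.opsgenie.com/v2/alerts`, authenticated with a `GenieKey` API key). The helper names below are illustrative, and a real API key is needed for the request to succeed:

```python
import json
import urllib.request

OPSGENIE_ALERTS_URL = "https://api.opsgenie.com/v2/alerts"

def build_alert(message, priority="P3", tags=None):
    # Assemble a minimal Alert API request body
    return {
        "message": message,
        "priority": priority,  # P1 (critical) through P5 (informational)
        "tags": tags or [],
    }

def send_alert(api_key, alert):
    # POST the alert to Opsgenie with GenieKey authentication
    req = urllib.request.Request(
        OPSGENIE_ALERTS_URL,
        data=json.dumps(alert).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"GenieKey {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (constructed but not sent here)
alert = build_alert("High CPU Usage on server-1", priority="P1", tags=["cpu", "prod"])
```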
Conclusion
In this article, we explained development flow automation in detail from the following nine perspectives:
- Automating data processing (using ETL scripts)
- Turning routine work into Bots (introducing Slack / Discord Bots)
- Automating infrastructure management (applying Terraform / Pulumi)
- Automating deployment (optimizing CI/CD)
- Automating error handling (using Sentry / Datadog)
- Automating Pull Request labeling and merging
- Automating log monitoring and anomaly detection
- Automating tests (introducing E2E / unit tests)
- Automating warning and alert notifications (using PagerDuty / Opsgenie)
By introducing automation, you can improve development productivity and enable smoother operations. Try introducing automation tools that fit your project!