{"id":18593,"date":"2023-08-08T09:05:19","date_gmt":"2023-08-08T13:05:19","guid":{"rendered":"https:\/\/qxf2.com\/blog\/?p=18593"},"modified":"2023-08-08T09:05:19","modified_gmt":"2023-08-08T13:05:19","slug":"airflow-control-ec2-instance-start-stop","status":"publish","type":"post","link":"https:\/\/qxf2.com\/blog\/airflow-control-ec2-instance-start-stop\/","title":{"rendered":"Using Airflow to start and stop EC2 instances"},"content":{"rendered":"<p>In this post, we will show you how to use <a href=\"https:\/\/airflow.apache.org\/\">Airflow<\/a> to start and stop <a href=\"https:\/\/docs.aws.amazon.com\/AWSEC2\/latest\/UserGuide\/concepts.html\">EC2<\/a> instances. Airflow is a popular open-source platform that engineering teams use to manage workflows. It uses a concept called Directed Acyclic Graphs (<a href=\"https:\/\/airflow.apache.org\/docs\/apache-airflow\/stable\/core-concepts\/dags.html\">DAGs<\/a>), which lets you chain multiple steps into a workflow. Airflow&#8217;s popularity is also partly due to an extensive library of <a href=\"https:\/\/airflow.apache.org\/docs\/apache-airflow\/stable\/_api\/airflow\/operators\/index.html\">operators<\/a>. Operators allow you to use code others have written within your workflow. <\/p>\n<h3>1. Background<\/h3>\n<p>As the industry sees a rise in applications built as loosely coupled microservices, the amount of infrastructure that a tester has to interact with has increased. Testing is evolving too. It is not uncommon for Qxf2 testers to have to start and stop instances automatically as part of specific testing workflows. But most of our clients do not want to give us direct access to their <a href=\"https:\/\/www.techtarget.com\/searchcloudcomputing\/definition\/Platform-as-a-Service-PaaS\">PaaS<\/a> providers (AWS, Azure, etc.). Instead, what is more common is that one tester is given access and then they set up Airflow jobs for the remaining testers. 
That way, the testing team only needs access to Airflow. <\/p>\n<p><a href=\"https:\/\/qxf2.com\/?utm_source=airflow_ec2&#038;utm_medium=click&#038;utm_campaign=From%20blog\">Qxf2<\/a> has fully embraced Airflow as our preferred tool for automating a broad range of routine tasks. <\/p>\n<hr>\n<h3>2. Airflow Start &#038; Stop EC2 Instances DAG:<\/h3>\n<p>To play along with this post, you need access to an Airflow instance and AWS credentials to access an EC2 instance. You will also need minimal knowledge of DAGs in Airflow so you can place the code in the right place. We will learn how to start an instance, stop it, and check the instance state to verify whether the start\/stop operation succeeded. <\/p>\n<p><strong>Note:<\/strong> For this post, we will pretend that we want to start and stop instances according to a schedule. We are doing this because we find auto-scheduled start\/stop to be useful in many testing scenarios.<\/p>\n<h4> 2.1) DAG to start EC2 instance:<\/h4>\n<p>To start the EC2 instance, we will use the <a href=\"https:\/\/airflow.apache.org\/docs\/apache-airflow-providers-amazon\/3.2.0\/_api\/airflow\/providers\/amazon\/aws\/operators\/ec2\/index.html\">EC2StartInstanceOperator<\/a>. In your <code>airflow\/dags\/<\/code> directory, create a Python file (e.g., start_ec2_instance_dag.py) and perform the following steps.<\/p>\n<h5> 2.1.1) Import all required packages: <\/h5>\n<pre lang=\"python\">\r\nimport datetime\r\nimport os\r\nfrom airflow.models import DAG, Variable\r\nfrom airflow.models.baseoperator import chain\r\nfrom airflow.providers.amazon.aws.operators.ec2 import EC2StartInstanceOperator\r\nfrom airflow.providers.amazon.aws.sensors.ec2 import EC2InstanceStateSensor\r\n<\/pre>\n<h5> 2.1.2) Read, set, and retrieve the Instance ID value from the environment: <\/h5>\n<p> We save the Instance ID value as an environment variable, <b>&#8220;START_INSTANCE_ID&#8221;<\/b>, and then set it as an Airflow Variable in the DAG using Variable.set(). 
To read the value that was set above, we use Variable.get(). Note that the environment variable has to be exported on the system where Airflow is running.<\/p>\n<pre lang=\"python\">\r\ninstance_id = os.environ.get(\"START_INSTANCE_ID\")\r\nif instance_id is not None:\r\n    Variable.set(\"START_INSTANCE_ID\", instance_id)\r\nelse:\r\n    print(\"START_INSTANCE_ID environment variable not found.\")\r\n# Retrieve the Instance ID\r\nINSTANCE_ID = Variable.get(\"START_INSTANCE_ID\")\r\n<\/pre>\n<h5> 2.1.3) Defining Start EC2 Instance DAG: <\/h5>\n<p>The code below defines a DAG for starting an EC2 instance. It sets various parameters such as the unique identifier (<strong>dag_id<\/strong>), the <strong>schedule<\/strong> for executing the DAG, the <strong>start date<\/strong> of the DAG, and <strong>tags<\/strong> for organizing the DAGs. The <strong>catchup flag<\/strong> controls whether missed runs are back-filled; it is set to False here, so missed runs are skipped.<\/p>\n<p>In this code, the DAG context is established, and the <strong>EC2StartInstanceOperator<\/strong> is used to start the instance. Specific parameters, such as the unique <strong>task ID<\/strong>, <strong>instance_id<\/strong>, and <strong>aws_conn_id<\/strong> (set to the default Airflow AWS connection), are configured. The <strong>region<\/strong> is also specified, and <strong>check_interval<\/strong> sets how often, in seconds, the operator polls to verify that the EC2 instance has started. Additionally, the code includes settings for <strong>retries<\/strong> in case of failures and the <strong>retry_delay<\/strong> period. <\/p>\n<pre lang=\"python\">\r\n# Start the EC2 instance every Monday morning at 7:30 AM. 
Change the schedule value as per requirement.\r\nwith DAG(\r\n    dag_id='Start_Newsletter_Staging_ec2_instance',\r\n    schedule=\"30 7 * * 1\",\r\n    start_date=datetime.datetime(2023, 7, 31),\r\n    tags=['example'],\r\n    catchup=False,\r\n) as dag:\r\n    # [START of the program - ec2_start_instance operator]\r\n    start_instance = EC2StartInstanceOperator(\r\n        task_id=\"ec2_start_instance\",\r\n        instance_id=INSTANCE_ID,\r\n        aws_conn_id=\"aws_default\",\r\n        region_name=\"us-east-1\",\r\n        check_interval=15,\r\n        retries=3,  # number of retries for the task\r\n        retry_delay=datetime.timedelta(minutes=5)  # delay between retries\r\n    )\r\n    # [END of the program - ec2_start_instance operator]\r\n<\/pre>\n<h5> 2.1.4) Monitoring EC2 Instance State: <\/h5>\n<p>The code below monitors the state of the EC2 instance. To check the state of the instance, we use the <a href=\"https:\/\/airflow.apache.org\/docs\/apache-airflow-providers-amazon\/stable\/_api\/airflow\/providers\/amazon\/aws\/sensors\/ec2\/index.html\">EC2InstanceStateSensor<\/a>. The parameters defined are the unique identifier <strong>task_id<\/strong>, <strong>instance_id<\/strong>, <strong>target_state<\/strong> (the state the sensor waits for the EC2 instance to reach), <strong>retries<\/strong>, and <strong>retry_delay<\/strong>.<\/p>\n<pre lang=\"python\">\r\n    instance_state = EC2InstanceStateSensor(\r\n        task_id=\"ec2_instance_state\",\r\n        instance_id=INSTANCE_ID,\r\n        target_state=\"running\",\r\n        retries=3,  # number of retries for the task\r\n        retry_delay=datetime.timedelta(minutes=5),\r\n    )\r\n<\/pre>\n<h5> 2.1.5) Sequence of Tasks: <\/h5>\n<p>Now chain the two tasks, so that the instance is started and its state is then sensed, by adding the following line.<\/p>\n<pre lang=\"python\">\r\n    chain(start_instance, instance_state)\r\n<\/pre>\n<p>And that&#8217;s it. 
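<\/p>\n<p>As a quick sanity check of the schedule used above: the cron expression &#8220;30 7 * * 1&#8221; reads minute 30, hour 7, any day of month, any month, day-of-week 1, and day-of-week 1 in cron is Monday. You can confirm the weekday mapping with plain Python (a stdlib-only sketch, no Airflow required; the sample date is just an arbitrary Monday after the DAG's start_date):<\/p>

```python
import datetime

# The DAG schedule '30 7 * * 1' fires at 07:30 on cron day-of-week 1 (Monday).
# 2023-08-07 is an arbitrary Monday, so it is a valid sample run date.
sample_run = datetime.datetime(2023, 8, 7, 7, 30)
assert sample_run.isoweekday() == 1  # isoweekday(): Monday == 1
print(sample_run.strftime('%A %H:%M'))
```

<p>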
You now have a DAG to start an EC2 instance.<\/p>\n<h4> 2.2) DAG to stop EC2 instance:<\/h4>\n<p>In this scenario, we use the <strong><a href=\"https:\/\/airflow.apache.org\/docs\/apache-airflow-providers-amazon\/2.3.0\/_api\/airflow\/providers\/amazon\/aws\/operators\/ec2_stop_instance\/index.html\">EC2StopInstanceOperator<\/a><\/strong> to stop an EC2 instance that is in the &#8220;running&#8221; state. This step is similar to starting an EC2 instance, so we will not work through it step by step. Instead, the entire code is provided below. You can place the code in <code>airflow\/dags\/stop_ec2_instance_dag.py<\/code>.<\/p>\n<pre lang=\"python\">\r\n\"\"\"\r\nUsing Airflow operators to stop an EC2 instance\r\n\"\"\"\r\nimport os\r\nimport datetime\r\nfrom airflow.models import DAG, Variable\r\nfrom airflow.models.baseoperator import chain\r\nfrom airflow.providers.amazon.aws.operators.ec2 import EC2StopInstanceOperator\r\nfrom airflow.providers.amazon.aws.sensors.ec2 import EC2InstanceStateSensor\r\n\r\n# Set the Instance ID variable. Export STOP_INSTANCE_ID on the system where Airflow runs.\r\ninstance_id = os.environ.get(\"STOP_INSTANCE_ID\")\r\nif instance_id is not None:\r\n    Variable.set(\"STOP_INSTANCE_ID\", instance_id)\r\nelse:\r\n    print(\"STOP_INSTANCE_ID environment variable not found.\")\r\n\r\n# Get INSTANCE_ID\r\nINSTANCE_ID = Variable.get(\"STOP_INSTANCE_ID\")\r\n# Stop the instance every Friday at midnight (12 AM). 
Change the schedule value as per requirement.\r\nwith DAG(\r\n    dag_id='Stop_Newsletter_Staging_ec2_instance',\r\n    schedule=\"0 0 * * 5\",\r\n    start_date=datetime.datetime(2023, 7, 31),\r\n    tags=['example'],\r\n    catchup=False,\r\n) as dag:\r\n\r\n    # [START of the program - ec2_stop_instance operator]\r\n    stop_instance = EC2StopInstanceOperator(\r\n        task_id=\"ec2_stop_instance\",\r\n        instance_id=INSTANCE_ID,\r\n        aws_conn_id=\"aws_default\",\r\n        region_name=\"us-east-1\",\r\n        check_interval=15,\r\n        retries=3,  # number of retries for the task\r\n        retry_delay=datetime.timedelta(minutes=5)  # delay between retries\r\n    )\r\n    # [END of the program - ec2_stop_instance operator]\r\n\r\n    # [START of the program - ec2 instance state sensor]\r\n    instance_state = EC2InstanceStateSensor(\r\n        task_id=\"ec2_instance_state\",\r\n        instance_id=INSTANCE_ID,\r\n        target_state=\"stopped\",\r\n    )\r\n    # [END of the program - ec2 instance state sensor]\r\n    chain(stop_instance, instance_state)\r\n<\/pre>\n<h4>3) <u>Non-default value for AWS connection<\/u>:<\/h4>\n<p>aws_conn_id is set to the default AWS connection (aws_default) in the start and stop EC2 instance operations above. To use a non-default AWS connection, use the code provided in this <a href=\"https:\/\/gist.github.com\/nelabhotlaR\/10fcaa3e6f9b7dece68087f58538e060\">gist<\/a>.<\/p>\n<h4>4) <u>Conclusion<\/u>:<\/h4>\n<p>We hope this post helps a few testers. Qxf2 is finding that working with orchestration tools, and knowing how to configure jobs and workflows in them, helps testers integrate better with teams. These tools have the additional benefit of enabling testers to independently execute tests that previously required help from developers and DevOps members.<\/p>\n<hr>\n<h4>Hire Qxf2 for your technical testing needs<\/h4>\n<p>Do you need technical software testers? We provide senior QA engineers who embed themselves into your team for short periods of time. 
We implement advanced testing tools and techniques to enable your development team to move faster. Our testers will also spend time coaching the team on better testing practices and implementing examples. Once the team is self-reliant, we leave. <a href=\"https:\/\/qxf2.com\/contact?utm_source=airflow_ec2&#038;utm_medium=click&#038;utm_campaign=From%20blog\">Contact us<\/a> to discuss your requirements and explore how we can support your testing needs.<\/p>\n<hr>\n","protected":false},"excerpt":{"rendered":"<p>In this post we will show you how to use Airflow to start and stop EC2 instances. Airflow is a popular open-source platform that engineering teams use to manage workflows. It uses a concept called Directed Acyclic Graphs (DAGs) which lets you chain multiple steps into a workflow. Airflow&#8217;s popularity is also partly due to an extensive library of operators. [&hellip;]<\/p>\n","protected":false},"author":37,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[348,358],"tags":[],"class_list":["post-18593","post","type-post","status-publish","format-standard","hentry","category-airflow","category-aws-ec2"],"_links":{"self":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/18593","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/users\/37"}],"replies":[{"embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/comments?post=18593"}],"version-history":[{"count":87,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/18593\/revisions"}],"predecessor-version":[{"id":19416,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/posts\/18593\/revisions\/19416"}],"wp:attachment":[{"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/media?parent=18593"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/categories?post=18593"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/qxf2.com\/blog\/wp-json\/wp\/v2\/tags?post=18593"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}