IT |
ITmedia All Articles |
[ITmedia News] "Jujutsu Kaisen 0" film reaches 12.3 billion yen at the box office, provisionally 22nd of all time; final moviegoer gifts start on the 12th |
https://www.itmedia.co.jp/news/articles/2203/07/news162.html
|
itmedia |
2022-03-07 20:48:00 |
IT |
ITmedia All Articles |
[ITmedia Business Online] Bamiyan rolls out a 1,649-yen all-you-can-eat dim sum deal at some locations; what is the aim? |
https://www.itmedia.co.jp/business/articles/2203/07/news160.html
|
itmedia |
2022-03-07 20:37:00 |
IT |
ITmedia All Articles |
[ITmedia News] "Haikyu!!" 10th-anniversary book gets an edition bundled with 30 acrylic stands, drawing buzz as "luxurious" and "great value for money" |
https://www.itmedia.co.jp/news/articles/2203/07/news161.html
|
itmedia |
2022-03-07 20:08:00 |
Ruby |
New posts tagged Ruby - Qiita |
Introducing Qiita::Markdown to Ruby on Rails |
https://qiita.com/zhangyouqiyou/items/3aa60ae49cd87ce4881c
|
Introducing Qiita::Markdown to Ruby on Rails. To install the gem, add gem 'qiita-markdown' to the Gemfile and run bundle install. Running bundle install failed with the error "An error occurred while installing rugged, and Bundler cannot continue. Make sure that `gem install rugged -v '…' --source '…'` succeeds before bundling." After searching online, I ran the following commands: gem install rugged -v '…' --source '…' and brew install cmake. That appeared to install correctly, so I ran bundle install again and the gem installed without problems. |
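A minimal sketch of the steps described above, assuming a macOS development machine with Homebrew and an existing Rails project; the rugged version and gem source quoted by Bundler are omitted in the excerpt, so the plain gem name is used here:

  # Add the gem to the Gemfile and install it
  echo "gem 'qiita-markdown'" >> Gemfile
  bundle install

  # If rugged fails to build, install CMake first, then retry
  brew install cmake
  gem install rugged
  bundle install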
2022-03-07 20:42:57 |
AWS |
New posts tagged AWS - Qiita |
AWS EC2 AmazonLinux2: install Composer pinned to a specific version |
https://qiita.com/miriwo/items/1c58776f3520610a9cf5
|
AWS EC2 AmazonLinux2: install Composer pinned to a specific version. Overview: install Composer on an EC2 Amazon Linux 2 instance so that the composer command can be run. |
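The body of this post is not included in the excerpt. As a hedged sketch of one way to pin a Composer version on Amazon Linux 2, the official installer accepts a --version flag; the version number and install path below are illustrative assumptions, and PHP is assumed to be installed already:

  # Download the official Composer installer
  curl -sS https://getcomposer.org/installer -o composer-setup.php

  # Install a specific Composer version system-wide (2.2.9 is only an example)
  sudo php composer-setup.php --install-dir=/usr/local/bin --filename=composer --version=2.2.9

  # Confirm the pinned version
  composer --version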
2022-03-07 20:49:54 |
AWS |
New posts tagged AWS - Qiita |
AWS EC2 AmazonLinux2: install PHP 8 using extras |
https://qiita.com/miriwo/items/9d876de230b66c70bb39
|
Run sudo yum install amazon-linux-extras, then run the command below to confirm that the PHP packages are included. |
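A minimal sketch of the flow the excerpt describes, assuming Amazon Linux 2; the package selection at the end is an assumption, since the excerpt is truncated before the install step:

  # Make sure the extras tooling is present (it is usually preinstalled on Amazon Linux 2)
  sudo yum install -y amazon-linux-extras

  # Confirm that PHP 8 topics are available
  amazon-linux-extras list | grep php

  # Enable the PHP 8.0 topic and install PHP
  sudo amazon-linux-extras enable php8.0
  sudo yum clean metadata
  sudo yum install -y php-cli php-fpm php-mbstring
  php -v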
2022-03-07 20:45:46 |
AWS |
New posts tagged AWS - Qiita |
Enable sending Gmail from an AWS Lightsail × WordPress × Contact Form 7 environment |
https://qiita.com/suama-akdo5317/items/9832bbcb176ee1d67c7d
|
|
2022-03-07 20:12:16 |
Azure |
New posts tagged Azure - Qiita |
How to fix an Azure Web App not reflecting a React app deployment |
https://qiita.com/Futo_Horio/items/f28ab486a63ff8bfa71f
|
After that, the container runs the bundled shell script /opt/startup/startup.sh, and the node /opt/startup/default-static-site.js command starts the application that serves the Web App's default initial page. |
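The excerpt suggests the container is falling back to the default static-site script rather than the deployed React build. One commonly used remedy, shown here only as a hedged sketch with placeholder resource names (not necessarily the fix the post describes), is to point the App Service startup command at the production build via the Azure CLI:

  # Serve the React production build instead of the default static site
  # <my-rg> and <my-webapp> are placeholders
  az webapp config set \
    --resource-group <my-rg> \
    --name <my-webapp> \
    --startup-file "pm2 serve /home/site/wwwroot/build --no-daemon"

  az webapp restart --resource-group <my-rg> --name <my-webapp>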
2022-03-07 20:36:41 |
Ruby |
New posts tagged Rails - Qiita |
[Rails] Steps for building an original app, part 1 [notes to self] |
https://qiita.com/Rkujira3206/items/0ee2796289c9be45fcf3
|
… decide who the app is for, work out which DB tables are needed, and come up with ideas for what the features should be. |
2022-03-07 20:57:05 |
Overseas TECH |
MakeUseOf |
The 6 Best Practices for Application Whitelisting |
https://www.makeuseof.com/application-whitelisting-practices/
|
learn |
2022-03-07 11:45:13 |
Overseas TECH |
MakeUseOf |
Canva vs. Pixlr: Which Free Design Software Is Better? |
https://www.makeuseof.com/canva-vs-pixlr/
|
design |
2022-03-07 11:15:13 |
Overseas TECH |
DEV Community |
Take your Serverless Functions to new speeds with Appwrite 0.13 |
https://dev.to/appwrite/take-your-serverless-functions-to-new-speeds-with-appwrite-013-5868
|
Take your Serverless Functions to new speeds with Appwrite 0.13.

What are Cloud Functions?
Cloud functions are a way of extending a cloud provider's services to execute your code and add functionality that did not previously exist. Quite a few services have this functionality; some examples include AWS Lambda, Google Cloud Functions and Vercel Functions. Amazon led the charge into cloud functions when it introduced AWS Lambda, with Google following up four years later by making Google Cloud Functions public for all. All of that brings us to today, where Appwrite is introducing a new generation of Appwrite Functions with a significantly improved execution model.

Architecture Overview
So what has changed compared to previous versions? As noted, our execution model has been wholly re-envisioned with speed in mind. To understand the changes, we first need to know how the original execution model worked: it repeated the same flow for every single execution, so essentially each execution spun up a new Docker container. This takes plenty of time and can put quite a lot of stress on the host machine. Now compare this to the 0.13 model. The updated model is more involved, even in a significantly simplified view, and no longer spins up a new runtime with every execution. Not only that, but each runtime now has a web server inside it to handle executions. Instead of command-line executions we now use HTTP requests, making executions much faster. Using this method of execution does mean a couple of changes for users: they must enter their script's filename instead of the entire command, and they now have to export their function. More details can be found in our functions documentation.

Dependency Management
Remember having to package your dependencies with your function code manually? Well, no more. With Appwrite 0.13 we have introduced a build stage into functions that automatically installs any dependencies you need. The build stage is also used to build the compiled runtimes ready for execution. Specific steps may be required for some languages, so we recommend checking our updated functions documentation.

Benchmarks
Thanks to our new execution model, functions are now many times as fast as before. We have also introduced the ability to use compiled languages for the first time in Appwrite, bringing Rust and an improved Swift runtime into the mix with some awe-inspiring execution times. Why don't we check out some solid benchmark numbers comparing the previous release to 0.13 in execution time and scale?

Our first test is a simple "Hello World" response from NodeJS using the asynchronous execution model. We use the asynchronous method to compare the two versions because the previous version does not support synchronous functions, and comparing asynchronous with synchronous would not be fair. We use k6 as the benchmarking tool, running on our local device for all scripts. To properly benchmark executions, we prepare a proxy that can freeze requests and count how many requests hit it in a specific timespan. The flow looks as follows: create an Appwrite function tag and activate it; spin up a proxy server with request freezing enabled; run the k6 benchmark for a fixed period; unfreeze the proxy server; wait for all executions to finish. With this setup, k6 will create as many executions as possible in that window. Only the first execution will start during this time, and it won't finish until the proxy server is unfrozen; such freezing ensures that executions don't eat up CPU while we are benchmarking how many executions Appwrite can create. After one minute the k6 benchmarks are complete and we unfreeze the proxy server, resuming the executions queue. We let all executions finish while tracking timing data on the proxy server. The results of this benchmark were breathtaking. Appwrite 0.13 created slightly more executions than the previous version; only a tiny improvement was expected here, as we did not refactor the creation process in this release. The stunning performance shows up when comparing how long these functions took to execute: Appwrite 0.13 sustained a dramatically higher rate of executions per minute than the previous version.

The second test used Appwrite in a real-world scenario to see how the average execution time improved. For this test we prepared the same script in six different runtimes: an example that converts a phone number to its country code, as covered in the Open Runtimes examples GitHub repository. The script ran the following steps: fetch the country phone prefixes database from the Appwrite Locale SDK; validate the request payload; find a match in the prefix; return the country information. For benchmarking these functions we used the same k6 technology, but this time in a simpler flow: create a function deployment and activate it; run the benchmark for a fixed period; wait for executions to finish; calculate the average execution time. We ran these scripts in six runtime languages in both versions, and the results dropped our jaws. The most surprising result was in Dart, where we managed to run this script in less than one millisecond.

On that note, let's start with Dart. Dart is a compiled language, meaning the result of a build is binary code. This makes execution extremely fast, as everything is ready for our server in zeros and ones. Due to poor support for compiled languages previously, the average execution time of our function used to be comparatively high; our expectations were already pretty high when running the same script in 0.13, but the incredible drop in average execution time left us speechless. Let's continue with another commonly used language, NodeJS: the same function executed far faster in 0.13 than before, which was the most surprising result for me, as I didn't expect such gains from an interpreted language. To follow up NodeJS we compared Deno, which had similar results in the previous version, with a slightly smaller gap after the upgrade. We continued with PHP, a well-known language running numerous websites on the internet; here too the average execution time dropped noticeably. We then ran the test in Python and Ruby, both with similar results: believe it or not, both improved dramatically in the new version, with Ruby a little bit faster than Python. As you can see, the execution rate has significantly improved with this release, and we look forward to seeing how developers using Appwrite will utilize these new features.

Engineering Challenges
To allow for synchronous execution and prioritize speed, we decided to depart from the task-based system that most of our workers use and instead create a new component of Appwrite called the executor. The executor handles all orchestration and execution responsibilities and removes the Docker socket from the functions worker. It is an HTTP server built with Swoole and Utopia, using various Appwrite libraries to interact with the database. One of the initial challenges was creating an orchestration library that would allow us to switch away from Docker down the line if we wanted to. That would enable other orchestration tools like Kubernetes or Podman, and allow our users to run Appwrite with their favourite orchestration tooling. Currently we have two adapters for this library, Docker CLI and Docker API; we plan to grow this selection as time progresses. The next challenge was that Swoole unfortunately had a few bugs in its coroutine cURL hook, which we use to communicate with the Docker API version of the orchestration library. These bugs forced us to use the Docker CLI adapter, which caused higher wait times and rate-limiting problems, and in turn instability under higher loads. Torsten, a Software Engineer at Appwrite, came up with a great solution: a queue-like system for utilizing Docker. This makes bringing up runtimes slightly slower, but actual executions always use a cURL request, so execution times are not negatively affected.

Conclusion
With these improvements, we hope that Appwrite can deliver even more to help you build your dream application without sacrificing speed or flexibility. We look forward to seeing what everyone builds with these new features and improvements to function execution speed. We encourage you to join our Discord server, where you can keep up to date with the latest on Appwrite and get help, if you need it, from the very same people who develop Appwrite. |
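Since the article's central change is that executions now go over HTTP to a web server inside each runtime, a hedged illustration of triggering an execution against Appwrite's REST API with curl may help; the host, project ID, function ID and API key are placeholders, and the exact payload fields can differ between Appwrite versions:

  # Trigger a function execution over HTTP (all identifiers are placeholders)
  curl -X POST \
    "https://<your-appwrite-host>/v1/functions/<FUNCTION_ID>/executions" \
    -H "Content-Type: application/json" \
    -H "X-Appwrite-Project: <PROJECT_ID>" \
    -H "X-Appwrite-Key: <API_KEY>" \
    -d '{"data": "optional payload passed to the function"}'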
2022-03-07 11:49:25 |
Overseas TECH |
DEV Community |
Orchestrating hybrid workflows using Amazon Managed Workflows for Apache Airflow (MWAA) |
https://dev.to/aws/orchestrating-hybrid-workflows-using-amazon-managed-workflows-for-apache-airflow-mwaa-2boc
|
Orchestrating hybrid workflows using Amazon Managed Workflows for Apache Airflow MWAA Using Apache Airflow to orchestrate hybrid workflowsIn some recent discussions with customers the topic of how open source is increasingly being used as a common mechanisms to help build re usable solutions that can protect investments in engineering and development time skills and that work across on premises and Cloud environment In my most viewed blog post talked about how you can build and deploy containerised applications anywhere Cloud your data centre other Clouds and on anything Intel and Arm I wanted to combine the learnings from that post and the code and apply it to another topic I have been diving deeper into Apache Airflow I wanted to explore how you can combine the two to see how you can start to build data pipelines that work across hybrid architectures seamlessly Use CaseSo why would we want to do this I can see a number of real world applications but the two that stand out for me are where you want to leverage and use existing legacy heritage systems within your data analytics pipelineswhere local regulation and compliance places additional controls as to where data can be processedIn this post I will show how you can address both of these use cases combining open source technology and a number of AWS products and services that will enable you to orchestrate workflows across heterogeneous environments using Apache Airflow In this demo I want to show you how you might approach orchestrating a data pipeline to build a centralised data lake across all your Cloud and non Cloud based data silos that respect local processing and controls that you might have This is what we will end up building As always you can find the code for this walk through in this GitHub repo blogpost airflow hybridApproachFirst up we need to create our demo customer data I used Mockaroo and found it super intuitive to generate sample data and then use that to setup a sample customer database running on MySQL I am not going to cover setting that up but I have included the data scripts in the repo and there is a section at the end of this blog where I share my setup if you want to set this up yourself The demo will have two MySQL databases running with the same database schema but with different data One will be running on an Amazon RDS MySQL instance in AWS and the other I have running on my local Ubuntu machine here at Home HQ The goal of this demo walk through is to orchestrate a workflow that will allow us to do batch extracts from these databases based on some specific criteria to simulate some regulator controls for example and upload these into our data lake We will be used an Amazon S bucket for this purpose as it is a common scenario The approach I am going to take is to create an Apache Airflow workflow DAG and leverage an Apache Airflow operator ECSOperator which allows us to launch container based images The container based images we launch will contain our ETL code and this will be parameterised so that we can re use this multiple times changing the behaviour by providing parameters during launch for example different SQL queries Finally we will use ECS Anywhere which uses the open source amazon ecs agent to simplify how we can run our containers anywhere in the Cloud on premises or on other Clouds To make this blog post easier to follow I will break it down into the different tasks First creating our ETL container that we can use to extract our data Following that I will show how we can run this via Amazon 
ECS and then walk through the process of deploying ECS Anywhere so we can run it on Amazon EC instances Cloud and on my humble very old Ubuntu box on prem With that all working we will then switch to Apache Airflow and create our DAG that puts these pieces together and creates a workflow that will define and launch these containers to run our ETL scripts and upload our data in our data lake Time to get started Pre reqsIf you want to follow along then you will need the following An AWS account with the right level of privileges and the AWS cli set up and running locallyAWS CDK installed and configured the latest version v or v Docker and Docker Compose in order to be able to build your container images and test locallyAccess to an AWS region where Managed Workflows for Apache Airflow is supported I will be using eu west London An environment of Amazon Managed Workflows for Apache Airflow already setup You can follow an earlier post and I have included the AWS CDK code on how to build that environment within the repoMySQL databases running a database and data you can query it doesn t have to be the same as I have used you can use your own I have provided the sql customer dummy data incase you wanted to use that if you try this demo for yourself They key thing here is that you have one running locally and one running in the cloud I am using Amazon RDS MySQL but you could easily run MySQL a different way if you want I have provided some details on how I set this up at the end of the blog post AWS Secrets configured to contain connection details to the MySQL databasesThe AWS cost for running this for hours is around according to my info on the AWS Billing console Creating our ETL containerCreating our ETL scriptFirst up we need to create our Python code that will do our ETL magic For the sake of simplicity I have created a simple script that takes some parameters and then runs a query and uploads the results to Amazon S read data q pyfrom copy import copyfrom mysql connector import MySQLConnection Errorfrom python mysql dbconfig import read db configimport sysimport csvimport botoimport jsonimport socketdef query with fetchone queryrun secret region try Grab MySQL connection and database settings We areusing AWS Secrets Manager but you could use another service like Hashicorp Vault We cannot use Apache Airflow to store these as this script runs stand alone secret name secret region name region session boto session Session client session client service name secretsmanager region name region name get secret value response client get secret value SecretId secret name info json loads get secret value response SecretString pw info password un info username hs info host db info database Output to the log so we can see and confirm WHERE we are running and WHAT we are connecting to print Connecting to str hs database str db as user str un print Database host IP is socket gethostbyname hs print Source IP is socket gethostname conn MySQLConnection user un password pw host hs database db cursor conn cursor query queryrun print Query is str query cursor execute query records cursor fetchall c csv writer open temp csv w c writerows records print Records exported for row in records print row row row row row row row row except Error as e print e finally cursor close conn close def upload to s sbucket sfolder region We will upload the temp temp csv file and copy it based on the input params of the script bucket and dir file try s boto client s region name region s upload file temp csv sbucket sfolder except 
FileNotFoundError print The file was not found return False except Error as e print e if name main try arg sys argv except IndexError raise SystemExit f Usage sys argv lt s bucket gt lt s file gt lt query gt lt secret gt lt region gt The script needs the following arguments to run Target S bucket where the output of the SQL script will be copied Target S folder filename The query to execute The parameter store we use AWS Secrets which holds the values on where to find the MySQL database The AWS region sbucket sys argv sfolder sys argv queryrun sys argv secret sys argv region sys argv query with fetchone queryrun secret region upload to s sbucket sfolder region AWS SecretsThis script takes a number of arguments documented above in the script and you may see that the th argument is a value we have not yet mentioned the parameter store When putting a solution like this we need a central store that our containers will be able to access some key information in a secure way where ever they might be running It felt wrong to include credential information in files when using this approach it is an option and you can do that with the code in the repo if you wish I decided I would use AWS Secrets to store the credentials and connection details for the MySQL databases You could of course use other services such as HashiCorp s Vault to do a similar thing All we need is a way of being able to store important credentials access them where ever our container needs to run Our script needs four pieces of information database host database name user to connect to the database and password for the user Create a json file I called my tmp elt json as follows and then change the values for the ones you need username app password secure password host localmysql airflow hybrid database localdemo aws secretsmanager create secret name localmysql airflow hybrid description Used by Apache Airflow ELT container secret string file tmp elt json region your region Which gives me this output ARN arn aws secretsmanager eu west xxxxx secret localmysql airflow hybrid XXXXXX Name localmysql airflow hybrid VersionId ffccf b ce bbbc fbaecd And we can check that this has stored everything by trying to retrieve the secret Note this will display the secret values so be careful when using this command to ensure you do not accidentally disclose those values aws secretsmanager get secret value secret id localmysql airflow hybrid region your region And if successful you will see your values The Python script we have created does the same as this step except it uses the boto library in order to interact with and obtain these values You should now have defined AWS Secrets for your MySQL databases In my specific case the two that I created are localmysql airflow hybrid this is configured to look for a database on my local network via a host of localmysql beachgeek co ukrds airflow hybrid this is configured to point to my MySQL database configured within Amazon RDSWe are now ready to connect to these databases and start extracting some valuable insights Testing our ScriptTo make sure everything is working before we package up and containerise it we can run this from the command line You can run the following command from the docker folder python app read data q pyWhich gives us this informational messageUsage app read data q py lt s bucket gt lt s file gt lt query gt lt secret gt lt region gt This is expected as we have not provided any arguments Now we can try with some arguments python app read data q py ricsue airflow hybrid customer 
regional csv select from customers WHERE country Spain rds airflow hybrid eu west and a couple of things should happen First you should see some out like the following Connecting to demords cidwsoyye eu west rds amazonaws com database demo as user adminDatabase host IP is Source IP is aeabcb ant amazon comQuery is select from customers WHERE country Spain Records exported Wiatt Revell wrevellq umn edu Female Spain Sheppard Rylett sryletthj java com Genderfluid Spain Sloane Maylour smaylourlq und de Female Spainthe second is that if you go to your Amazon S bucket you should now have a folder and file called using the example above customer regional csv Assuming you have success we are now ready to move to the next stage which is packaging up this so we can run it via a container Containerising our ETL scriptMost organisations will have their own workflow for containerising applications and so feel free to adopt use those if you want In order to containerise our application we need to create a Docker file As we are using Python I am going to select a Python base image for this container public ecr aws docker library python latest The script we created in the previous step is in an app folder so we set this with the WORKDIR We then copy to and run PIP to install our Python dependencies mysql connector python and boto We then copy the files and specify how we want this container to start when it is run executing python app read data py FROM public ecr aws docker library python latestWORKDIR appCOPY requirements txt requirements txtRUN pip install r requirements txtCOPY CMD python app read data py To build package our container we need to Build our image using docker build Tag the image and then Push the image to a container repository To simplify this part of the process you will find a script setup sh which takes our Python script and create our container repository in Amazon ECR builds and then tag s before pushing the image to that repo Before running the script you just need to change the variables AWS DEFAULT REGION AWS ACCOUNT and then if you want to customise for your own purposes change AWS ECR REPO and COMMIT HASH usr bin env bash Change these values for your own environment it should match what values you use in the CDK app if you are using this script together to deploy the multi arch demoAWS DEFAULT REGION eu west AWS ACCOUNT AWS ECR REPO hybrid airflowCOMMIT HASH airflw You can alter these values but the defaults will work for any environmentIMAGE TAG COMMIT HASH latest AMD TAG COMMIT HASH amdDOCKER CLI EXPERIMENTAL enabledREPOSITORY URI AWS ACCOUNT dkr ecr AWS DEFAULT REGION amazonaws com AWS ECR REPO Login to ECR aws ecr get login region AWS DEFAULT REGION no include email create AWS ECR Repoif aws ecr describe repositories repository names AWS ECR REPO then echo Skipping the create repo as already exists else echo Creating repos as it does not exists aws ecr create repository region AWS DEFAULT REGION repository name AWS ECR REPOfi Build initial image and upload to ECR Repodocker build t REPOSITORY URI latest docker tag REPOSITORY URI latest REPOSITORY URI AMD TAGdocker push REPOSITORY URI AMD TAG Create the image manifests and upload to ECRdocker manifest create REPOSITORY URI COMMIT HASH REPOSITORY URI AMD TAGdocker manifest annotate arch amd REPOSITORY URI COMMIT HASH REPOSITORY URI AMD TAGdocker manifest inspect REPOSITORY URI COMMIT HASHdocker manifest push REPOSITORY URI COMMIT HASHWhen you run the script you should see output similar to the following From my home 
broadband connection this took around minutes to complete so the amount it will take will vary depending on how good your internet speed upload is Login SucceededAn error occurred RepositoryNotFoundException when calling the DescribeRepositories operation The repository with name hybrid airflow does not exist in the registry with id Creating repos as it does not exists repository repositoryArn arn aws ecr eu west repository hybrid airflow registryId repositoryName hybrid airflow repositoryUri dkr ecr eu west amazonaws com hybrid airflow createdAt imageTagMutability MUTABLE imageScanningConfiguration scanOnPush false encryptionConfiguration encryptionType AES Building s FINISHED gt internal load build definition from Dockerfile s gt gt transferring dockerfile B s gt internal load dockerignore s gt gt transferring context B s gt internal load metadata for public ecr aws docker library python latest s gt internal load build context s gt gt transferring context kB s gt FROM public ecr aws docker library python latest s gt CACHED WORKDIR app s gt CACHED COPY requirements txt requirements txt s gt CACHED RUN pip install r requirements txt s gt COPY s gt exporting to image s gt gt exporting layers s gt gt writing image sha acfcaaacdbcaebdedcecaf s gt gt naming to dkr ecr eu west amazonaws com hybrid airflow latest sUse docker scan to run Snyk tests against images to find vulnerabilities and learn how to fix themThe push refers to repository dkr ecr eu west amazonaws com hybrid airflow bf Pusheddad Pushing gt MB MBaff Pushedffbcb Pusheddab Pushedaccb Pushedebe Pushing gt MB MBccfb Pushing gt MB MBebdb Pushing gt MB MBabcf Pushing gt MB MBbedceaba Waitingdcccdc Waitingbfff Waiting airflw amd digest sha ffcedfdaafbabdebfbcd size Created manifest list dkr ecr eu west amazonaws com hybrid airflow airflw schemaVersion mediaType application vnd docker distribution manifest list v json manifests mediaType application vnd docker distribution manifest v json size digest sha ffcedfdaafbabdebfbcd platform architecture amd os linux sha bbfffbaeaccedaecafbccfdaIf you open the Amazon ECR console you should now see your new container repository and image The container image should now be available via the following resource uri dkr ecr eu west amazonaws com hybrid airflow airflwTesting our containerised ETL scriptNow that we have this script containerised lets see how it works First of all we can try the followingdocker run dkr ecr eu west amazonaws com hybrid airflow airflwWhich should provide you the following outputUnable to find image dkr ecr eu west amazonaws com hybrid airflow airflw locallyairflw Pulling from hybrid airflowDigest sha bbfffbaeaccedaecafbccfdaStatus Downloaded newer image for dkr ecr eu west amazonaws com hybrid airflow airflwUsage app read data q py lt s bucket gt lt s file gt lt query gt lt secret gt lt region gt Note If you get an error such as docker Error response from daemon pull access denied for dkr ecr eu west amazonaws com hybrid airflow repository does not exist or may require docker login denied Your authorization token has expired Reauthenticate and try again then you can re login to your Amazon ECR environment This is the command I use yours will be different based on the region you are in aws ecr get login region eu west no include email We know this would generate an error as we have not supplied the correct parameters but we can see that it is providing the expected behavior Now lets try with this command docker run dkr ecr eu west amazonaws com hybrid airflow airflw ricsue 
airflow hybrid customer regional csv select from customers WHERE country Spain rds airflow hybrid eu west This time I get a different error The end part of the error gives us some clues botocore exceptions NoCredentialsError Unable to locate credentialsDuring handling of the above exception another exception occurred Traceback most recent call last File app app read data q py line in lt module gt query with fetchone queryrun secret region File app app read data q py line in query with fetchone cursor close UnboundLocalError local variable cursor referenced before assignmentThe reason for the error is that there are no AWS credentials that this container has in order to interact with AWS We can easily fix that We could add our AWS credentials in our Container but I would NOT recommend that and it is generally a very bad idea to do so Instead we can create two environment variables AWS ACCESS KEY ID and AWS SECRET ACCESS KEY which we will pass into Docker when we run the container as environment variables Note Please do not disclose the values of thsee publicly or store them where others might be able to copy them export AWS ACCESS KEY ID XXXXXXXXXexport AWS SECRET ACCESS KEY XXXXXXXWe can now change our Docker run command slightly and try againdocker run e AWS ACCESS KEY ID AWS ACCESS KEY ID e AWS SECRET ACCESS KEY AWS SECRET ACCESS KEY dkr ecr eu west amazonaws com hybrid airflow airflw ricsue airflow hybrid customer regional csv select from customers WHERE country Spain rds airflow hybrid eu west And this time success Connecting to demords cidwsoyye eu west rds amazonaws com database demo as user adminDatabase host IP is Source IP is edffaeaQuery is select from customers WHERE country Spain Records exported Wiatt Revell wrevellq umn edu Female Spain Sheppard Rylett sryletthj java com Genderfluid Spain Sloane Maylour smaylourlq und de Female SpainFor the eagled eyed you can see the output is the same which is re assuring but the Source IP is different This is because this time the source host was the Docker container running this script What about testing this against our local MySQL server which is running a similar data set I run this command with the only difference being the AWS Secret which determines which MySQL database to try and connect to docker run e AWS ACCESS KEY ID AWS ACCESS KEY ID e AWS SECRET ACCESS KEY AWS SECRET ACCESS KEY dkr ecr eu west amazonaws com hybrid airflow airflw ricsue airflow hybrid customer regional csv select from customers WHERE country Spain localmysql airflow hybrid eu west We get an error Traceback most recent call last File app app read data q py line in query with fetchone print Database host IP is socket gethostbyname hs socket gaierror Errno Name or service not knownDuring handling of the above exception another exception occurred This is to be expected At the beginning of this walk through I explained we were using etc hosts to do resolution of the database host Again Docker allows us to enable Host based lookups with a command line options network host so we can retry with this command docker run network host e AWS ACCESS KEY ID AWS ACCESS KEY ID e AWS SECRET ACCESS KEY AWS SECRET ACCESS KEY dkr ecr eu west amazonaws com hybrid airflow airflw ricsue airflow hybrid customer regional csv select from customers WHERE country Spain localmysql airflow hybrid eu west Success This time we get a different set of data the sample data for the two regions is not the same and we can see the connection details are updated to reflect we are accessing a local 
database In a real world scenario these would be local DNS resolvable addresses you use to connect to your resources Connecting to localmysql beachgeek co uk database localdemo as user rootDatabase host IP is Source IP is dfaaeQuery is select from customers WHERE country Spain Records exported Dag Delacourt ddelacourtj nydailynews com Male SpainTo recap what we have done so far is to containerise our ETL script and successfully run this against both local and remote instances of our MySQL databases The next stage is to move these onto a container orchestrator I am going to be using Amazon ECS which is my favourite way to run my container applications Right let s get straight to it Running our ETL container on Amazon ECSCreating our ECS ClusterWe are now going to set up an Amazon ECS cluster on AWS in the same region as where we have been working so far We could do this manually but to simplify this part of the setup within the GitHub repo you will find the cdk cdk ecs folder which contains a CDK stack that will deploy and configure an Amazon ECS cluster and then create an ECS Task Definition which will container all the bits of our ETL container we did above I have tested this on both v and v CDK and it works fine with both This will create a VPC and deploy a new ECS Cluster and an ECS node within that cluster We first need to update the app py and update some of the properties You will need to update the AWS and Region account for your own environment The values for ecr repo and image tag need to be the same as what we used in the setup of the Amazon ECR repository the previous steps where we ran the setup sh script the default values used are the same used here Once updated save the file The value for s is the target Amazon S bucket you will upload files to This is used to create permissions in this bucket to copy files If you set this incorrectly the ETL script will fail to upload because of a lack of permissions from aws cdk import corefrom ecs anywhere ecs anywhere vpc import EcsAnywhereVPCStackfrom ecs anywhere ecs anywhere taskdef import EcsAnywhereTaskDefStackenv EU core Environment region eu west account props ecsclustername hybrid airflow ecr repo hybrid airflow image tag airflw awsvpccidr s ricsue airflow hybrid app core App mydc vpc EcsAnywhereVPCStack scope app id ecs anywhere vpc env env EU props props mydc ecs cicd EcsAnywhereTaskDefStack scope app id ecs anywhere taskdef env env EU vpc mydc vpc vpc props props app synth From the command line when you type cdk lsYou should get something like the following ecs anywhere vpcecs anywhere taskdefWe create our Amazon ECS VPC by running cdk deploy ecs anywhere vpcNote Typically an AWS account is limited to VPCs so check your AWS account to see how many you have configured before running this otherwise running this stack will generate an error And when completed we can deploy our Amazon ECS Cluster by running cdk deploy ecs anywhere taskdef answer Y to any prompts you might get which should result in output similar to the following ecs anywhere taskdefOutputs ecs anywhere taskdef ECSClusterName hybrid airflow clusterecs anywhere taskdef ECSRoleName hybrid airflow ECSInstanceRoleecs anywhere taskdef ECSAnywhereRoleName hybrid airflow ExternalECSAnywhereRoleStack ARN arn aws cloudformation eu west stack ecs anywhere taskdef b bc ec adb ecWe will need some of the values that this CDK app has output so take note of these We can validate that everything has been setup by going to the AWS console and viewing the Amazon ECS console 
Alternatively we can run the following command aws ecs list clusters region your region Which should display something like the following depending on how many other ECS Clusters you might already have setup and we can see that we have our new ECS Cluster the name is defined in the variables in the app py of ecsclustername hybrid airflow following by cluster clusterArns arn aws ecs eu west cluster hybrid airflow cluster Exploring the ECS CDK stackBefore continuing it is worth covering what exactly we just did by exploring the CDK file The CDK file that deployed this ECS Cluster and configured our Task Definition the thing that will run our application is ecs anywhere taskdef py As we can see from the beginning of the file we import the values from the app py props which we will use to define things like the name of the ECS Cluster the Container image permissions and so on class EcsAnywhereTaskDefStack core Stack def init self scope core Construct id str vpc props kwargs gt None super init scope id kwargs We create variables for the container image of our ETL application by referencing the container repository here we are using Amazon ECR but you could use others airflow repo ecr Repository from repository name self Hybrid ELT Repo repository name f props ecr repo airflow image ecs ContainerImage from ecr repository airflow repo f props image tag When we create the ECS Cluster it will create an EC instance and this instance will need an IAM Role to give it the needed permissions This is the ECS Task Execution Role Here we only need to define it and give it a name role name ecscluster role iam Role self f props ecsclustername ecsrole role name f props ecsclustername ECSInstanceRole assumed by iam ServicePrincipal ssm amazonaws com managed policies iam ManagedPolicy from aws managed policy name AmazonSSMManagedInstanceCore ecscluster role add managed policy iam ManagedPolicy from aws managed policy name service role AmazonECContainerServiceforECRole We then create our ECS Cluster give it a name and then provision some EC resources using the IAM Role we just created ecscluster ecs Cluster self f props ecsclustername ecscluster cluster name f props ecsclustername cluster vpc vpc ecscluster add capacity xAutoScalingGroup instance type ec InstanceType t xlarge desired capacity We now need to create a Role for the ECS Task definition This will be the IAM Role that our application will use so we define here the different permissions scoped as low as possible We pass in some of those variables such as the S bucket and the ECS Cluster name First we create the policy data lake s Bucket from bucket name self DataLake f props s data lake arn data lake bucket arn task def policy document iam PolicyDocument statements iam PolicyStatement actions s effect iam Effect ALLOW resources f data lake arn f data lake arn iam PolicyStatement actions ecs RunTask ecs DescribeTasks ecs RegisterTaskDefinition ecs DescribeTaskDefinition ecs ListTasks effect iam Effect ALLOW resources iam PolicyStatement actions iam PassRole effect iam Effect ALLOW resources conditions StringLike iam PassedToService ecs tasks amazonaws com iam PolicyStatement actions logs CreateLogStream logs CreateLogGroup logs PutLogEvents logs GetLogEvents logs GetLogRecord logs GetLogGroupFields logs GetQueryResults effect iam Effect ALLOW resources f arn aws logs log group ecs props ecsclustername log stream ecs And then we create the IAM Roles and attach any managed policies we might need in this case we want to use the Secrets Manager managed 
policy so we can read our secrets task def policy document role iam Role self ECSTaskDefRole role name f props ecsclustername ECSTaskDefRole assumed by iam ServicePrincipal ecs tasks amazonaws com inline policies ECSTaskDefPolicyDocument task def policy document managed secret manager policy iam ManagedPolicy from aws managed policy name SecretsManagerReadWrite task def policy document role add managed policy managed secret manager policy The final bit is to actually create our Task Definition defining the actual container image the command override the resources etc We also need to define and then create the AWS CloudWatch logging group so we can view the logs in AWS CloudWatch You can configure other logging targets if you want log group log LogGroup self LogGroup log group name f ecs props ecsclustername ec task definition ecs EcTaskDefinition self f props ecsclustername ApacheAirflowTaskDef family apache airflow network mode ecs NetworkMode HOST task role task def policy document role ec task definition add container Hybrid ELT TaskDef image airflow image memory limit mib cpu Configure CloudWatch logging logging ecs LogDrivers aws logs stream prefix f props ecsclustername log group log group essential True command ricsue airflow hybrid period hq data csv select from customers WHERE country Spain rds airflow hybrid eu west And that is it once deployed in around minutes we have our new ECS Cluster with an EC resource up and running and our Task Definition defined and ready to run In theory we can now just run this and it should be the same as when we ran it locally using Docker Running our ELT Container Amazon ECS Task Definition From a command line you can kick off this Task by running the following command export ECS CLUSTER hybrid airflow cluster export TASK DEF apache airflow export DEFAULT REGION eu west aws ecs run task cluster ECS CLUSTER launch type EC task definition TASK DEF region DEFAULT REGIONWhich should create output like this tasks attachments attributes name ecs cpu architecture value x availabilityZone eu west b clusterArn arn aws ecs eu west cluster hybrid airflow cluster containerInstanceArn arn aws ecs eu west container instance hybrid airflow cluster ccbcbaafdcdcb containers containerArn arn aws ecs eu west container hybrid airflow cluster bbcebaeccbaec acc f cdfdcce taskArn arn aws ecs eu west task hybrid airflow cluster bbcebaeccbaec name Hybrid ELT TaskDef image dkr ecr eu west amazonaws com hybrid airflow airflw lastStatus PENDING networkInterfaces cpu memory cpu createdAt desiredStatus RUNNING enableExecuteCommand false group family apache airflow lastStatus PENDING launchType EC memory overrides containerOverrides name Hybrid ELT TaskDef inferenceAcceleratorOverrides tags taskArn arn aws ecs eu west task hybrid airflow cluster bbcebaeccbaec taskDefinitionArn arn aws ecs eu west task definition apache airflow version failures We have configured AWS CloudWatch logging so we can go here to see the output of this From the CDK stack we defined our logging group ecs hybrid airflow so we can open up this log group and we can see an entry in the format ofhybrid airflow Hybrid ELT TaskDef bbcebaeccbaecWhen we open up this stream we can see that the output is exactly what we expected and matches what we can when we ran this container locally Connecting to demords cidwsoyye eu west rds amazonaws com database demo as user adminDatabase host IP is Source IP is ip eu west compute internalQuery is select from customers WHERE country Spain Records exported Wiatt Revell 
wrevellq umn edu Female Spain Sheppard Rylett sryletthj java com Genderfluid Spain Sloane Maylour smaylourlq und de Female SpainWe have now successfully run our containerised ETL script in the Cloud Next step running this anywhere with ECS Anywhere Deploying ECS AnywhereBefore we can run our containerised ETL script in our local environment we need to install the Amazon ECS Anywhere agent This extends the Amazon ECS control plane and allows you to integrate external resources that allow you to run your Task Definitions your apps wherever the ECS Anywhere agent is running These appear as a new ECS launch type of EXTERNAL whereas when you run your Task Definitions you might typically be using EC or FARGATE Note If you want to dive deeper that I suggest you check out the ECS Workshop where they have a dedicated section on ECS Anywhere To deploy ECS Anywhere we will need to do the following Create a new IAM Role that will be used by ECS Anywhere the control plane Install the ECS Anywhere Agent and integrating into an existing ECS ClusterBefore diving in though which kinds of hosts can you deploy the ECS Anywhere agent onto You can check the latest OSs supported by visiting the FAQ page As of writing this includes Amazon Linux Bottlerocket Ubuntu RHEL SUSE Debian CentOS and Fedora You should also consider the host resources as the CPU and Memory available will dictate how Tasks are executed on which hosts Which ever distribution you use you will need to have the AWS cli installed and configured so make sure this has been done before proceeding NEWS FLASH Announced this week you can now run the ECS Anywhere agent on Windows I am very excited about this and I will try and find a Windows machine on which to try this out as an addendum in the future For the purposes of this walkthrough I have used two different hosts The first is my local Ubuntu box which is sat next to me It has an Intel i CPU quad core and GB of ram The second is an EC instance an mi large which has vCPUs and GB ram This is running Amazon Linux As you will recall from the pre req s on BOTH of these instances I have installed MySQL and amended the etc hosts to add a local hosts entry to the name localmysql beachgeek co uk The EC instance is ip eu west compute internal and we will see this later as confirmation that this local ECS Anywhere agent is running our ETL scripts Creating the IAM RoleBefore we install the ECS Anywhere agent we need to create an IAM Role which will be used by the agent This will have been created by the ECS CDK stack and we can look at the code that creates this external task def policy document role iam Role self ExternalECSAnywhereRole role name f props ecsclustername ExternalECSAnywhereRole assumed by iam ServicePrincipal ecs tasks amazonaws com external managed SSM policy iam ManagedPolicy from aws managed policy name AmazonSSMManagedInstanceCore external managed ECS policy iam ManagedPolicy from aws managed policy name service role AmazonECContainerServiceforECRole external task def policy document role add managed policy external managed SSM policy external task def policy document role add managed policy external managed ECS policy As we can see it adds to AWS IAM Managed policies AmazonSSMManagedInstanceCore and AmazonECContainerServiceforECRole which will provide the ECS Anywhere agent everything it needs to connect to and register as an external managed instance in the ECS Cluster The output of this will be displayed when you deployed the script ecs anywhere taskdef ECSAnywhereRoleName hybrid 
airflow ExternalECSAnywhereRoleWe will need this when we install the ECS Anywhere agent Installing amp integration of ECS AnywhereThere are two ways you can do this Via a script that is generated via the Amazon ECS Console or via the cli Via the Amazon ECS ConsoleFrom the Amazon ECS Console there is an easy way to add an external resource via the ECS Anywhere agent If you click on the ECS Cluster you will see a button that allows you to Register External Instances When you click on it you will need to select the name of the IAM Role we just created In the example above it will be hybrid airflow ExternalECSAnywhereRole but it may be different if you are following along When you click on NEXT you will see some text You will use this to install the ECS Anywhere agent Copy this and then paste it into a terminal of the system you want to install the ECS Anywhere agent Note Be careful as this activation script is time bound and will expire after a period of time Do not share this and make it public I have changed the details in the screenshot so this is for illustration purposes only The installation process will take minutes and you will get confirmation if successful You can confirm that you now have a new resource in your ECS Cluster by clicking on the ECS Cluster and checking for External Instances highlighted in red you can see here I have two registered Via the Amazon ECS ConsoleTo install via the command line first open up a terminal session in the machine you want to install the agent First I create some environment variables ECS CLUSTER the name of the ECS cluster ROLE the name of the ECS Anywhere Role DEFAULT REGION the AWS region you are working in Using my example above I create the followingexport ECS CLUSTER hybrid airflow cluster export ROLE NAME hybrid airflow ExternalECSAnywhereRole export DEFAULT REGION eu west Now run this command on the machineaws ssm create activation iam role ROLE NAME tee ssm activation jsonWhich will given you output similar to A file will be created ssm activation json which contains this info ActivationCode sAsrZpOktv hFnytAWS ActivationId bdbd b be df We will now create a couple of new environment variables with these values export ACTIVATION ID bdbd b be df export ACTIVATION CODE sAsrZpOktv hFnytAWS Next we download the agent installation scriptcurl o ecs anywhere install sh amp amp sudo chmod x ecs anywhere install shAnd then finally we run the script passing in the environment variables we have already createdsudo ecs anywhere install sh cluster ECS CLUSTER activation id ACTIVATION ID activation code ACTIVATION CODE region DEFAULT REGION The script will now start to install connect and then register the AWS SSM and ECS agents If successful you will see something like this Trying to wait for ECS agent to start Ping ECS Agent registered successfully Container instance arn arn aws ecs eu west container instance hybrid airflow cluster cbcadfcbbd You can check your ECS cluster here clusters hybrid airflow cluster okAs with the Console installation you can check that this has worked by checking the AWS ECS Console You can also use the command line aws ecs list container instances cluster ECS CLUSTERWhich should provide you something similar to the following containerInstanceArns arn aws ecs eu west container instance hybrid airflow cluster cbcadfcbbd arn aws ecs eu west container instance hybrid airflow cluster ccbcbaafdcdcb As you can see the first instance there is the one we have just integrated into the ECS Cluster Now that we have this installed we 
can test running our ETL container via Amazon ECS TroubleshootingNote I have previously covered installation of the ECS Anywhere on a previous blog post most viewed blog post which provides some additional details that you can refer to if you get stuck Running our ETL script locally We now have EXTERNAL resources within our ECS Cluster which enable us to determine where we want to run our Task Definitions our containerised ETL script in this case We can try this out next Running our Task locallyFrom a command line you can kick off running our containerised ETL script by running the following command You will notice it is exactly the same command the only difference is that the launch type parameter has changed to EXTERNAL export ECS CLUSTER hybrid airflow cluster export TASK DEF apache airflow export DEFAULT REGION eu west aws ecs run task cluster ECS CLUSTER launch type EXTERNAL task definition TASK DEF region DEFAULT REGIONWhen we kick this off we get a similar output Notice the launchType this time we shows up as EXTERNAL tasks attachments attributes name ecs cpu architecture value x clusterArn arn aws ecs eu west cluster hybrid airflow cluster containerInstanceArn arn aws ecs eu west container instance hybrid airflow cluster cbcadfcbbd containers containerArn arn aws ecs eu west container hybrid airflow cluster fcecbfacbac efd aad adfd de taskArn arn aws ecs eu west task hybrid airflow cluster fcecbfacbac name Hybrid ELT TaskDef image dkr ecr eu west amazonaws com hybrid airflow airflw lastStatus PENDING networkInterfaces cpu memory cpu createdAt desiredStatus RUNNING enableExecuteCommand false group family apache airflow lastStatus PENDING launchType EXTERNAL memory overrides containerOverrides name Hybrid ELT TaskDef inferenceAcceleratorOverrides tags taskArn arn aws ecs eu west task hybrid airflow cluster fcecbfacbac taskDefinitionArn arn aws ecs eu west task definition apache airflow version failures As we did before we can check in the CloudWatch logs and we can see this worked We can see that the Source IP is our local ECS Anywhere instance Connecting to demords cidwsoyye eu west rds amazonaws com database demo as user adminDatabase host IP is Source IP is ip eu west compute internalQuery is select from customers WHERE location Spain Records exported Wiatt Revell wrevellq umn edu Female Spain Sheppard Rylett sryletthj java com Genderfluid Spain Sloane Maylour smaylourlq und de Female SpainAccessing local resourcesThe previous example showed us running our containerised ETL script but accessing resources in the Cloud For the specific use case we are looking at hybrid orchestration we really need to show this accessing local resources We will do that in the next section as we move to how we can orchestrate these via Apache Airflow Creating and running our hybrid workflow using Apache AirflowUp until now we have validated that we can take our ETL script containerise it and then run it anywhere Cloud locally and even our developer setup The next stage is to incorporate this as part of our data pipeline using Apache Airflow Setting up Apache AirflowTo start off with we need an Apache Airflow environment I have put together a post on how you can get this up and running on AWS using Managed Workflows for Apache Airflow MWAA which you can check out here I have included the code within the code repo ECSOperatorApache Airflow has a number Operators which you can think of as templates that make it easier to perform tasks These Operators are used when you define tasks and you pass in various 
details and the code behind the Operators does the heavy lifting There are a number of Operators that enable you to work with AWS services these are packaged into what is called the Amazon provider package and the one we are interested in is the ECS Operator which allows us to define and run ECS Tasks i e our ETL container script MWAA uses dedicated workers to execute the tasks and in order to govern what these worker nodes can do they have an IAM Role The IAM Role known as the MWAA Execution Policy governs what resources your workflows have access to We need to add some additional permissions if we want our workflows and therefore the worker nodes to be able to create and run ECS Tasks The required permissions have been added to the CDK script but are here for reference iam PolicyStatement actions ecs RunTask ecs DescribeTasks ecs RegisterTaskDefinition ecs DescribeTaskDefinition ecs ListTasks effect iam Effect ALLOW resources iam PolicyStatement actions iam PassRole effect iam Effect ALLOW resources conditions StringLike iam PassedToService ecs tasks amazonaws com Creating our WorkflowWithe the foundational stuff now done we can create our workflow It starts off like any other typical DAG by importing the Python libraries we are going to use As you can see we import the ECSOperator from airflow import DAGfrom datetime import datetime timedeltafrom airflow providers amazon aws operators ecs import ECSOperatordefault args owner ubuntu start date datetime retry delay timedelta seconds We now create our task using the ECSOperator and supplying configuration details In the example below I have hard coded some of the values for example the ECS Cluster name and Task Definitions but you could either store these as variables within Apache Airflow a centralised store like AWS Secrets or event use a parameter file when triggering the DAG with DAG hybrid airflow dag catchup False default args default args schedule interval None as dag cloudquery ECSOperator task id cloudquery dag dag cluster hybrid airflow cluster task definition apache airflow overrides launch type EC awslogs group ecs hybrid airflow awslogs stream prefix ecs Hybrid ELT TaskDef reattach True cloudqueryAnd that is it We have not defined a schedule we will just make this an on demand workflow for testing We save this the code is in the repo and upload it into the Apache Airflow DAGs folder and in a few seconds the workflow should appear in the Apache Airflow UI Running our Workflow CloudWe can enable the workflow by unpausing it and then we can trigger it to test Once triggered Apache Airflow will submit this task to the scheduler which will queue the task via the executor to a MWAA worker node The task will execute on there running the ECS Task Definition using the parameters above These are the same that we used when we ran this via the command line After about minutes the task should complete The colour of the task in the Apache Airflow UI will change from light green running to dark green complete We can now view the log by clicking on the task and click on Log You will see something similar to the following I have omitted some of the log for brevity UTC taskinstance py INFO Starting attempt of UTC taskinstance py INFO UTC taskinstance py INFO Executing lt Task ECSOperator cloudquery gt on UTC standard task runner py INFO Started process to run task UTC standard task runner py INFO Started process to run task UTC standard task runner py INFO Job Subtask cloudquery UTC base aws py INFO Airflow Connection aws conn id aws default 
UTC base aws py INFO No credentials retrieved from Connection UTC base aws py INFO Creating session with aws access key id None region name eu west UTC base aws py INFO role arn is None UTC logging mixin py INFO Running lt TaskInstance hybrid airflow ec dag cloudquery manual T running gt on host ip eu west compute internal UTC taskinstance py INFO Exporting the following env vars AIRFLOW CTX DAG OWNER ubuntuAIRFLOW CTX DAG ID hybrid airflow ec dagAIRFLOW CTX TASK ID cloudqueryAIRFLOW CTX EXECUTION DATE T AIRFLOW CTX DAG RUN ID manual T UTC ecs py INFO Running ECS Task Task definition apache airflow on cluster hybrid airflow cluster UTC ecs py INFO ECSOperator overrides UTC ecs py INFO No active previously launched task found to reattach UTC ecs py INFO ECS Task started UTC ecs py INFO ECS task ID is afacefaeacabce UTC ecs py INFO Starting ECS Task Log Fetcher UTC ecs py INFO ECS Task has been successfully executed UTC taskinstance py INFO Marking task as SUCCESS dag id hybrid airflow ec dag task id cloudquery execution date T start date T end date T UTC local task job py INFO Task exited with return code UTC local task job py INFO downstream tasks scheduled from follow on schedule checkIf we look at the logs in our AWS CloudWatch group We can see the following Connecting to demords cidwsoyye eu west rds amazonaws com database demo as user adminDatabase host IP is Source IP is ip eu west compute internalQuery is select from customers WHERE country Spain Records exported Wiatt Revell wrevellq umn edu Female Spain Sheppard Rylett sryletthj java com Genderfluid Spain Sloane Maylour smaylourlq und de Female SpainSuccess We have now triggered our workflow via Apache Airflow The Task Definition ran on our EC instance that is in the Cloud we can see that from the Source IP and by checking this against the EC instance that is running containers for the ECS Cluster Now see how to trigger the task locally Running our Workflow LocalTo run our task on our local node all we need to do is change the launch type from EC to EXTERNAL When the ECS Cluster receives the request to run this task it will take a look at which nodes are running the ECS Anywhere agent and then select on to run the task on there remotequery ECSOperator task id remotequery dag dag cluster hybrid airflow cluster task definition apache airflow overrides launch type EXTERNAL awslogs group ecs hybrid airflow awslogs stream prefix ecs Hybrid ELT TaskDef reattach True remotequeryWhen we create a new DAG that does this and upload it the code is in the repo we can trigger this in exactly the same way Once triggered like before we need to wait minutes Once finished we can check the Logs within the Apache Airflow UI or via the CloudWatch log stream Connecting to demords cidwsoyye eu west rds amazonaws com database demo as user adminDatabase host IP is Source IP is ip eu west compute internalQuery is select from customers WHERE country Spain Records exported Wiatt Revell wrevellq umn edu Female Spain Sheppard Rylett sryletthj java com Genderfluid Spain Sloane Maylour smaylourlq und de Female SpainWe can see that this time the source IP is our local machine where we have the ECS Anywhere agent running Running our Workflow Local and accessing local resourcesIn the previous example we were still accessing the Cloud based MySQL database Connecting to demords cidwsoyye eu west rds amazonaws com but what we really want to do is to connect to our local resources when orchestrating these kinds of hybrid tasks Lets do that now If you can recall the ETL 
script uses parameters stored in AWS Secrets you could use a different repository if you wanted to know which MySQL database to connect to We can see this in the command we pass into the containerised ETL script command ricsue airflow hybrid period region data csv select from customers WHERE country Spain rds airflow hybrid eu west The value of rds airflow hybrid points to an AWS Secrets record that stores the database host database username and password At the beginning of this walkthrough we created another record that points to our local MySQL database which is localmysql airflow hybrid so we can create a new task that looks like the following with DAG hybrid airflow dag catchup False default args default args schedule interval None as dag localquery ECSOperator task id localquery dag dag cluster hybrid airflow cluster task definition apache airflow overrides containerOverrides name Hybrid ELT TaskDef command ricsue airflow hybrid period region data csv select from customers WHERE country Spain localmysql airflow hybrid eu west launch type EC awslogs group ecs hybrid airflow awslogs stream prefix ecs Hybrid ELT TaskDef reattach True localqueryAs before after creating this new DAG and uploading it the code is in the repo we can trigger this in exactly the same way Once triggered like before we need to wait minutes Once finished we can check the Logs within the Apache Airflow UI or via the CloudWatch log stream Connecting to localmysql beachgeek co uk database localdemo as user rootDatabase host IP is Source IP is ip eu west compute internalQuery is select from customers WHERE country Spain Records exported Dag Delacourt ddelacourtj nydailynews com Male SpainWe can see that the source IP is our local machine where we have the ECS Anywhere agent running AND we are now connecting to our local MySQL instance Note The reason why we have a different set of results this time is that we are using a different set of sample data In all the previous queries we were running them against the Amazon RDS MySQL instanceCongratulations We have now orchestrated running a local ETL script via ECS Anywhere accessing local resources and uploaded the results back up to our data lake on Amazon S Doing more with your WorkflowNow that you have the basics you can extend this and play around with how you can create re usable workflows Some of the things you can do include Over ride the default ETL script parameters we defined in the Task Defintion for example we can change the query or any of the other parameters for our ETL scriptwith DAG hybrid airflow dag catchup False default args default args schedule interval None as dag cloudquery ECSOperator task id cloudquery dag dag cluster hybrid airflow cluster task definition apache airflow overrides containerOverrides name airflow hybrid demo command ricsue airflow hybrid period region data csv select from customers WHERE country Poland rds airflow hybrid eu west launch type EC awslogs group ecs hybrid airflow awslogs stream prefix ecs Hybrid ELT TaskDef reattach True cloudqueryCreate new Task Defintions using the PythonOperator and boto to create a new Task Definition and then run it via the ECSOperator This is an example of how you would do that from airflow import DAGfrom datetime import datetime timedeltafrom airflow providers amazon aws operators ecs import ECSOperatorfrom airflow operators python import PythonOperatorimport botodefault args owner ubuntu start date datetime retry delay timedelta seconds def create task client boto client ecs region name eu west 
response client register task definition containerDefinitions name Hybrid ELT TaskDef image dkr ecr eu west amazonaws com hybrid airflow airflw cpu portMappings essential True environment mountPoints volumesFrom command ricsue airflow hybrid period hq data csv select from customers WHERE country England rds airflow hybrid eu west logConfiguration logDriver awslogs options awslogs group ecs hybrid airflow awslogs region eu west awslogs stream prefix ecs Hybrid ELT TaskDef taskRoleArn arn aws iam role hybrid airflow ECSTaskDefRole executionRoleArn arn aws iam role ecs anywhere taskdef hybridairflowApacheAirflowTas QLKRWWCUTD family apache airflow networkMode host requiresCompatibilities EXTERNAL cpu memory print response with DAG hybrid airflow dag taskdef catchup False default args default args schedule interval None as dag create taskdef PythonOperator task id create taskdef provide context True python callable create task dag dag cloudquery ECSOperator task id cloudquery dag dag cluster hybrid airflow cluster task definition apache airflow overrides launch type EC awslogs group ecs hybrid airflow awslogs stream prefix ecs Hybrid ELT TaskDef reattach True create taskdef gt gt cloudquery ConclusionWhat did we learn In this walk through we saw how we can take some of the steps you might typically use when creating a data pipeline containerise those and then create workflows in Apache Airflow that allow you to orchestrate hybrid data pipelines to run where ever we need them in the Cloud or in a remote data centre or network How can you improve this There are lots of ways you could use an approach like this and I would love to hear from you I would love to get your feedback on this post What did you like what do you think needs improving If you could complete this very very short survey I will use this information to improve future posts Many thanks Before you go make sure you think about removing cleaning up the AWS resources and local ones you might have setup Cleaning upIf you have followed this walk through then before leaving make sure you remove delete any resources you created to ensure you do not keep any recurring costs Review and clean up the following resources you may have provisioned as you followed alongDelete the DAGs that run the hybrid workflowDelete the Amazon ECS cluster and tasksDelete the Amazon ECR container repository and imagesDelete the files you have copied to the Amazon S bucketDelete the sample customer databases and if you created them MySQL instances Uninstall the ECS Anywhere agent and purge any local container imagesDelete any CloudWatch log groups that were created TroubleshootingAs with all of my blogging adventures I often come across things that I did not expect and plenty of mistakes I make which will hopefully save you the time So here is a few of the gotcha s I found during the setting up of this Troubleshooting ECS Anywhere as mentioned else where you can run into some issues which I have comprehensively outlined in another blog post Permissions as I was putting this walk through together I encountered many IAM related permissions issues This is because I do not want to start off with broad access and resource privileges which means trial and error to identify the needed permissions Here are some of the errors that I encountered that were permissions related These have all been incorporated in the CDK stack above and is included for reference Apache Airflow Worker TasksWhen using the ECSOperator I ran into permissions when trying to execute the ECS 
Operator task This is the error I encountered botocore errorfactory AccessDeniedException An error occurred AccessDeniedException when calling the RunTask operation User arn aws sts assumed role AmazonMWAA hybrid demo kwZCZS AmazonMWAA airflow is not authorized to perform ecs RunTask on resource arn aws ecs eu west task definition airflow hybrid ecs task because no identity based policy allows the ecs RunTask action UTC taskinstance py INFO Marking task as FAILED dag id airflow dag test task id airflow hybrid ecs task query execution date T start date T end date T UTC standard task runner py ERROR Failed to execute job for task airflow hybrid ecs task queryI needed to amend my MWAA Execution policy to enable these permissions This is the policy that I added Version Statement Sid VisualEditor Effect Allow Action ecs RunTask ecs DescribeTasks Resource Action iam PassRole Effect Allow Resource Condition StringLike iam PassedToService ecs tasks amazonaws com And I needed to update the Trust relationship of the MWAA Execution role as follows Version Statement Sid Effect Allow Principal Service ecs tasks amazonaws com Action sts AssumeRole When using boto to create a new Task Definition we encountered another error botocore errorfactory AccessDeniedException An error occurred AccessDeniedException when calling the RegisterTaskDefinition operation User arn aws sts assumed role AmazonMWAA hybrid demo kwZCZS AmazonMWAA airflow is not authorized to perform ecs RegisterTaskDefinition on resource because no identity based policy allows the ecs RegisterTaskDefinition action UTC taskinstance py INFO Marking task as FAILED dag id airflow ecsanywhere boto task id create taskdef execution date T start date T end date T UTC standard task runner py ERROR Failed to execute job for task create taskdefWe just needed to add a further ECS action ecs RegisterTaskDefinition One of the features of the ECSOperator is the attach True switch Enabling this generated permissions error botocore errorfactory AccessDeniedException An error occurred AccessDeniedException when calling the DescribeTaskDefinition operation User arn aws sts assumed role AmazonMWAA hybrid demo kwZCZS AmazonMWAA airflow is not authorized to perform ecs DescribeTaskDefinition on resource because no identity based policy allows the ecs DescribeTaskDefinition action UTC taskinstance py INFO Marking task as FAILED dag id hybrid airflow dag test task id cloudquery execution date T start date T end date TAdding the following additional permissions resolve this ecs DescribeTaskDefinition ecs ListTasks Updating the Trust relationships on the ECS Cluster Task Execution role to resolve a number of not allowed to pass Role error messages Edit the Trust Relationship of the ECS Execution Role so that the MWAA Workers are allowed to execute these tasks on our behalf Version Statement Sid Effect Allow Principal Service ecs tasks amazonaws com Action sts AssumeRole and change to add the Apache Airflow service which will allow the workers schedulers to kick this off Version Statement Sid Effect Allow Principal Service airflow env amazonaws com airflow amazonaws com ecs tasks amazonaws com Action sts AssumeRole The workflow ran into permissions issues when copying the files to the Amazon S bucket I needed to amend my MWAA Execution policy so I created a new policy that allows you to copy files to the S bucket and attached it This is what it looked like the Resource will be different depending on where you want to copy your files to Version Statement Sid 
VisualEditor Effect Allow Action s PutObject s GetObject s ListBucket s DeleteObject Resource arn aws s ricsue airflow hybrid The workflows were not generating any CloudWatch logs and this was because I had not added the correct permissions To fix this I amended the policy to include and match the AWS CloudWatch logging group that was created during the ECS Cluster creation and also make it so that we can do the necessary logging Effect Allow Action logs CreateLogStream logs CreateLogGroup logs PutLogEvents logs GetLogEvents logs GetLogRecord logs GetLogGroupFields logs GetQueryResults Resource arn aws logs log group ecs hyrid airflow log stream ecs Setting up the MySQL test dataHow I set up my MySql database machine locally UbuntuThis is the procedure I used I assume you have some basic knowledge of MySQL This is ok for a test demo setup but absolutely not ok for anything else As root user you will need to run the following commands sudo apt update sudo apt install mysql server sudo mysql secure installationFollow the prompts to enable local access I did have to run sudo mysql uroot p to login in order to be able to setup permissions Create mysql users and update permissions so it can connect query databases I create a user called admin which has access to all databases from any hosts On the machine you have just install MySQL update bind to on etc mysql mysql conf d mysqld cnf and then sudo systemctl restart mysql Update your local etc hosts to add the host name for the local database that will be used by the ETL script to connect to You use this value when defining the connection details in AWS Secrets We use etc hosts as this is a simple demo setup I used localmysql beachgeek co uk in my setup Test connectivity to that and make sure it resolves to Now login to MySQL and create your demo database and then create a table using the followingcreate database localdemo use localdemo create table customers id INT date DATE first name VARCHAR last name VARCHAR email VARCHAR gender VARCHAR ip address VARCHAR country VARCHAR consent VARCHAR Copy the sample customer data in the repo and you can now import the data using the following command gt mysql u user p localdemo lt customer reg sqlHow I set up my MySql database machine locally Amazon Linux A similar process was used to install MySQL on Amazon Linux Run the following commands to install and configure MySQL sudo yum install sudo amazon linux extras install epel ysudo rpm import sudo yum install mysql community serversudo systemctl enable now mysqldYou will now need to change the root password To find the current temporary password run sudo grep temporary password var log mysqld log which will show you the password Something like this you will see T Z Note MY Server A temporary password is generated for root localhost F lFfe QGqXAnd then set new root password withsudo mysql secure installation p F lFfe QGqX Update your local etc hosts to add the host name for the local database that will be used by the ETL script to connect to You use this value when defining the connection details in AWS Secrets We use etc hosts as this is a simple demo setup I used localmysql beachgeek co uk in my setup Test connectivity to that and make sure it resolves to Now login to MySQL and create your demo database and then create a table using the followingcreate database localdemo use localdemo create table customers id INT date DATE first name VARCHAR last name VARCHAR email VARCHAR gender VARCHAR ip address VARCHAR country VARCHAR consent VARCHAR Copy the sample 
customer data in the repo and you can now import the data using the following command: mysql -u user -p localdemo < customer_reg.sql
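For reference, here is a minimal sketch of what the containerised ETL script used throughout this walkthrough might look like. This is my own illustration, not the original author's code: it assumes the secret in AWS Secrets Manager is a JSON document with host, username, password and database keys, that boto3 and mysql-connector-python are installed in the container image, and that the positional arguments mirror the command shown in the Task Definition (S3 bucket, S3 folder, file name, query, secret name, region). Names such as the bucket and secret are reconstructed from the flattened text above with assumed hyphenation.

import csv
import io
import json
import sys

import boto3
import mysql.connector


def get_db_secret(secret_name, region):
    # Look up the database connection details stored in AWS Secrets Manager.
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])


def run_etl(s3_bucket, s3_folder, s3_file, query, secret_name, region):
    secret = get_db_secret(secret_name, region)

    # Connect to MySQL (either the Amazon RDS instance or the local database,
    # depending on which secret name was passed in) and run the query.
    connection = mysql.connector.connect(
        host=secret["host"],
        user=secret["username"],
        password=secret["password"],
        database=secret["database"],
    )
    cursor = connection.cursor()
    cursor.execute(query)
    rows = cursor.fetchall()

    # Write the result set to an in-memory CSV and upload it to the data lake on S3.
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    boto3.client("s3", region_name=region).put_object(
        Bucket=s3_bucket,
        Key=f"{s3_folder}/{s3_file}",
        Body=buffer.getvalue(),
    )

    print(f"{len(rows)} records exported")
    cursor.close()
    connection.close()


if __name__ == "__main__":
    # Example (values illustrative):
    # python app.py my-bucket period1/region1 data.csv \
    #   "select * from customers WHERE country = 'Spain'" rds-airflow-hybrid eu-west-2
    run_etl(*sys.argv[1:7])

Passing localmysql-airflow-hybrid instead of rds-airflow-hybrid as the secret name would point the same script at the local MySQL instance rather than Amazon RDS, which is what lets a single container image serve both the Cloud and the local runs.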
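To tie the individual DAG snippets above together, here is a hedged sketch of a single workflow that runs the same Task Definition first in the Cloud (launch_type EC2) and then on the on-premises node registered via ECS Anywhere (launch_type EXTERNAL). The cluster, Task Definition and log group names follow the ones used in the walkthrough, reconstructed with assumed hyphenation, and the start_date is illustrative.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.amazon.aws.operators.ecs import ECSOperator

default_args = {
    "owner": "ubuntu",
    "start_date": datetime(2022, 3, 1),   # illustrative start date
    "retry_delay": timedelta(seconds=60),
}

with DAG("hybrid_airflow_combined_dag", catchup=False,
         default_args=default_args, schedule_interval=None) as dag:

    # Run the Task Definition on the EC2 container instance in the Cloud.
    cloudquery = ECSOperator(
        task_id="cloudquery",
        cluster="hybrid-airflow-cluster",
        task_definition="apache-airflow",
        launch_type="EC2",
        overrides={},
        awslogs_group="/ecs/hybrid-airflow",
        awslogs_stream_prefix="ecs/Hybrid-ELT-TaskDef",
        reattach=True,
    )

    # Run the same Task Definition on the local node registered via ECS Anywhere.
    remotequery = ECSOperator(
        task_id="remotequery",
        cluster="hybrid-airflow-cluster",
        task_definition="apache-airflow",
        launch_type="EXTERNAL",
        overrides={},
        awslogs_group="/ecs/hybrid-airflow",
        awslogs_stream_prefix="ecs/Hybrid-ELT-TaskDef",
        reattach=True,
    )

    # Cloud query first, then the local one.
    cloudquery >> remotequery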
2022-03-07 11:40:52 |
Overseas TECH
DEV Community |
React Sticky Children |
https://dev.to/imkevdev/react-sticky-children-2808
|
React Sticky Children: I created a simple ReactJS plugin, react-sticky-children, to abstract away the complexities of the IntersectionObserver API and allow you to apply styles to a component as it approaches the top of the viewport. Useful for scroll-to-top behaviour, complex sticky elements, or animating components into view. Demo and usage: import ReactStickyChildren from 'react-sticky-children'; <ReactStickyChildren initialStyle={initialStyle} intersectingStyle={intersectingStyle}><MyComponent /></ReactStickyChildren>. Looking for feedback! NPM | GitHub
2022-03-07 11:07:29 |
Overseas TECH
DEV Community |
Literals, Variables and Constants in C++ |
https://dev.to/thenerdydev/literals-variables-and-constants-in-c-52lc
|
Literals, Variables and Constants in C++. In this tutorial let us learn about variables, literals and constants in C++ with the help of some examples. Let us first learn what a variable is in C++. C++ Variables: In programming, a variable is a container used to store data, so it holds information that we can later reference and manipulate in our code. To create a variable we first need a unique name for it; this unique name is also called the identifier. For example, age could be a variable of the int data type to which we assign an integer value, and b could be a variable of the char data type to which we assign some character data. Note: a variable with the int data type can only hold integers, and a variable with the char data type can only hold character data. If we want to store decimals and exponentials we can make use of the double data type. Another thing to note is that variables are mutable: you can reassign a variable to a different value, which means you can change its contents. Rules for naming a variable: a variable name can only contain letters, digits and the underscore; it cannot begin with a digit; it is preferred practice to begin variable names with a lowercase character (for example, name is preferable to Name); a keyword such as int cannot be used as a variable name; a variable name can start with an underscore. Also ensure that you give your variables meaningful names that make sense semantically. C++ Literals: Literals are data used for representing fixed values, and they can be used directly in the code. In essence, a literal is the source-code representation of a fixed value, a constant value that can be stored inside a variable (for example, the character 'z' or a number written directly in the code). They are called literals because you cannot assign a different value to them. Here is a list of the different kinds of literals we have in C++ programming. Integers: according to Wikipedia, "An integer is a datum of integral data type, a data type that represents some range of mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values. Integers are commonly represented in a computer as a group of binary digits (bits)." This means that an integer literal is a numerical literal with no fractional or exponent part. There are three types of integer literals in C++ programming: decimal (base 10), octal (base 8) and hexadecimal (base 16). A thing to note here is that an octal literal starts with a 0 and a hexadecimal literal starts with 0x (for example, 0xf). Floating-point Literals: floating-point literals are numbers that have a decimal point or an exponential part. They can be written as real literals (with a decimal point) or in exponential form using E notation, where the E denotes a power of ten. Characters: a character literal represents a single character's value within the source code of a program, and it is written by enclosing a single character inside single quotes, for example 'b', 'f' or 'G'. Escape Sequences: an escape sequence is a combination of characters that has a meaning other than the literal characters contained in it; it is marked by one or more preceding characters. Here is a list of escape sequences and the characters they represent: \b backspace, \f form feed, \n newline, \r carriage return, \t horizontal tab, \v vertical tab, \\ backslash, \' single quotation mark, \" double quotation mark, \? question mark, \0 null character. String Literals: a string literal is a sequence of characters enclosed in double quotation marks. For example, "This is a string" creates a string, "" is an empty string, " " is a string containing a single whitespace, "x" is a single character enclosed within double quotes (a string, not a char), and "This is a string\n" prints a string followed by a newline. We will also learn about strings in a separate article. Let us now move on to constants in C++. C++ Constants: the value of a constant never changes once it has been defined; it is immutable and cannot be reassigned once its value has been set. We use the const keyword to declare a constant, for example one named PI. If we then try to mutate or change the value of PI we will get an error, because we cannot reassign a variable declared with the const keyword. We can also create a constant using the #define preprocessor directive, something we will learn about in a separate article.
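Since the article's inline examples did not survive extraction, here is a small illustrative sketch of my own (names and values are not the original author's) showing variables, the main kinds of literals, escape sequences and a const in C++:

#include <iostream>
#include <string>

int main() {
    // Variables: named containers for data whose contents can be reassigned.
    int age = 22;
    age = 23;                          // reassignment is allowed for variables

    // Literals: fixed values written directly in the source code.
    int decimal = 99;                  // decimal integer literal (base 10)
    int octal = 0101;                  // octal integer literal (base 8, starts with 0)
    int hex = 0xf;                     // hexadecimal integer literal (base 16, starts with 0x)
    double exponent = 2.3e-5;          // floating-point literal with an exponential part
    char letter = 'b';                 // character literal
    std::string text = "A string\n";   // string literal containing an escape sequence

    // Constants: once initialised with const, they cannot be reassigned.
    const double PI = 3.14159;
    // PI = 3.2;                       // compile error: assignment of read-only variable

    std::cout << age << " " << decimal << " " << octal << " " << hex << " "
              << exponent << " " << letter << " " << text << PI << std::endl;
    return 0;
}

Uncommenting the PI = 3.2; line should produce a compiler error along the lines of "assignment of read-only variable 'PI'", which is the behaviour the constants section describes.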
2022-03-07 11:02:26 |
News
BBC News - Home |
Ukraine conflict: Shares slide as oil and gas prices soar |
https://www.bbc.co.uk/news/business-60642786?at_medium=RSS&at_campaign=KARANGA
|
bills |
2022-03-07 11:32:38 |
News
BBC News - Home |
Ukraine war: PM to hold talks with world leaders on further sanctions |
https://www.bbc.co.uk/news/uk-60642926?at_medium=RSS&at_campaign=KARANGA
|
policy |
2022-03-07 11:45:52 |
News
BBC News - Home |
One dead after Scottish trawler capsizes off Norway |
https://www.bbc.co.uk/news/uk-scotland-60645898?at_medium=RSS&at_campaign=KARANGA
|
members |
2022-03-07 11:47:58 |
News
BBC News - Home |
Shane Warne: Australian cricket legend died from natural causes - police |
https://www.bbc.co.uk/news/world-asia-60645939?at_medium=RSS&at_campaign=KARANGA
|
samui |
2022-03-07 11:26:47 |
Hokkaido
Hokkaido Shimbun
Global COVID-19 deaths reach 6 million; pace slows even as variants spread
https://www.hokkaido-np.co.jp/article/653988/
|
tally
2022-03-07 20:08:00 |
Hokkaido
Hokkaido Shimbun
High hurdles to extending the Shinkansen to Sapporo before a 2030 Olympics as construction falls far behind schedule
https://www.hokkaido-np.co.jp/article/653981/
|
delay
2022-03-07 20:01:59 |