Posted: 2021-06-12 17:27:05 RSS feed digest for 2021-06-12 17:00 (28 items)

Category Site Article title / trending keyword Link URL Frequent words / summary / search volume Date registered
IT MOONGIFT Trombone.js - A trombone you play on the web http://feedproxy.google.com/~r/moongift/~3/rFs13KPX-48/ trombonejs 2021-06-12 17:00:00
python New posts tagged Python - Qiita How to lowercase (or uppercase) the file names in a folder with Python https://qiita.com/miiitaka/items/5bfd42d5363874147c57 The files in the folder had a mix of upper- and lowercase names, so I wanted to normalize them all to lowercase and wrote a small Python script to do it; this post is a memo of that. 2021-06-12 16:47:00
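A minimal sketch of the kind of script the post describes, assuming a flat folder of regular files (the directory path is a placeholder):

```python
# Rename every file in a folder so its name is all lowercase.
import os

target_dir = "/path/to/folder"  # placeholder: point this at the folder to normalize

for name in os.listdir(target_dir):
    src = os.path.join(target_dir, name)
    dst = os.path.join(target_dir, name.lower())
    # Only rename regular files, and skip names that are already lowercase.
    if os.path.isfile(src) and src != dst:
        os.rename(src, dst)
```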
python New posts tagged Python - Qiita Python: computing the p-adic expansion of a rational number https://qiita.com/quryu/items/e0a118788a90fb43aadd Click "SageMath". 2021-06-12 16:11:23
js New posts tagged JavaScript - Qiita [Algorithm] Solving string problems in JavaScript https://qiita.com/suzuki0430/items/995df487307cd329d90e Once the object is built, compare each character key's occurrence count with max and update max whenever a larger value appears. 2021-06-12 16:52:10
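The counting approach the post describes, sketched here in Python for illustration (the original post is about JavaScript):

```python
# Find the most frequent character in a string: build a count map and
# keep updating the running maximum whenever a larger count appears.
def most_frequent_char(text):
    counts = {}
    best_char, best_count = "", 0
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
        if counts[ch] > best_count:
            best_char, best_count = ch, counts[ch]
    return best_char

print(most_frequent_char("abracadabra"))  # -> 'a'
```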
Program New questions (all tags)|teratail How to resolve "PDO drivers no value" https://teratail.com/questions/343631?rss=all Environment: Windows, MySQL, Apache, PHP. Problem: to confirm that PHP is connected to the MySQL database, I created a test PHP file and tried to open it in the browser, but got the message "could not find driver". 2021-06-12 16:57:04
Program New questions (all tags)|teratail Unity: visualizing item pickup https://teratail.com/questions/343630?rss=all I am making an adventure game in Unity and, like picking up items in the old Pokémon games, I want the player to approach an object and press a button to pick the item up, and then have it appear in an inventory window. 2021-06-12 16:49:09
Program New questions (all tags)|teratail Searching in a binary tree https://teratail.com/questions/343629?rss=all Goal: run the code and confirm that every check prints True; the code builds a tree with root = Bintree(), calls root.insert(three), root.insert(five) and root.insert(eight), then prints root.lessThan(key) for several keys and root.lessThan(None). 2021-06-12 16:46:39
Program New questions (all tags)|teratail Can't resolve a DuplicateTable error in a Rails app https://teratail.com/questions/343628?rss=all How the problem arose: to add Pry (for inspecting request parameters) to a Rails app, I added pry-rails to the Gemfile, ran bundle install, and restarted the local server. 2021-06-12 16:42:51
Program New questions (all tags)|teratail Where and how PostgreSQL stores its data https://teratail.com/questions/343627?rss=all A question about where and how PostgreSQL saves its data. 2021-06-12 16:34:30
Program New questions (all tags)|teratail How to link Anaconda and VS Code inside Docker (Ubuntu) https://teratail.com/questions/343626?rss=all How to connect Anaconda and VS Code inside Docker on Ubuntu; thank you for reading. 2021-06-12 16:29:01
Program New questions (all tags)|teratail Why use np.random.randn for the initial weights? https://teratail.com/questions/343625?rss=all While studying "Deep Learning from Scratch", I realized I no longer understand why the weights W have been initialized with np.random.randn all along. 2021-06-12 16:28:05
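For context, a minimal sketch of the kind of initialization the question refers to; the layer sizes and the 0.01 scale factor here are illustrative, not taken from the book:

```python
# np.random.randn draws weights from a standard normal distribution;
# multiplying by a small constant keeps the initial activations small.
import numpy as np

input_size, hidden_size = 784, 50   # illustrative layer sizes
weight_init_std = 0.01              # illustrative scale factor

W1 = weight_init_std * np.random.randn(input_size, hidden_size)
b1 = np.zeros(hidden_size)
print(W1.shape, round(W1.std(), 4))  # (784, 50) and a standard deviation of roughly 0.01
```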
Program New questions (all tags)|teratail (Custom data attributes?) How should I write code to extract the following tags? https://teratail.com/questions/343624?rss=all The markup consists of div tags with a data-kyujin-list-item="" attribute added; how should I write code to extract those tags? 2021-06-12 16:19:28
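One possible way to pull out such tags, assuming the page is parsed with BeautifulSoup; the attribute name comes from the question, while the sample markup is invented for illustration:

```python
# Select only the div tags that carry the custom data attribute.
from bs4 import BeautifulSoup

html = """
<div data-kyujin-list-item="">job listing A</div>
<div>unrelated content</div>
<div data-kyujin-list-item="">job listing B</div>
"""

soup = BeautifulSoup(html, "html.parser")
# attrs={...: True} matches any element that has the attribute, regardless of its value.
for div in soup.find_all("div", attrs={"data-kyujin-list-item": True}):
    print(div.get_text(strip=True))  # -> job listing A, job listing B
```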
Program New questions (all tags)|teratail How to study low-level programming https://teratail.com/questions/343623?rss=all About how to study low-level programming; I have no experience with embedded systems. 2021-06-12 16:10:46
Program New questions (all tags)|teratail A mysterious error appears when using Unity Terrain https://teratail.com/questions/343622?rss=all When I use Terrain in Unity, the error "Invalid editor window UnityEditor.FallbackEditorWindow" appears from time to time. 2021-06-12 16:03:14
Program New questions (all tags)|teratail I can't work out the output of a PHP function in WordPress https://teratail.com/questions/343621?rss=all I am trying to edit archive-abc.php, the template file for the custom post type "abc" in a WordPress theme, but I can't get the PHP function to output what I expect. 2021-06-12 16:02:32
AWS New posts tagged AWS - Qiita I took an AWS exam online via Pearson VUE (pre-check edition) https://qiita.com/yoshinori-takeuchi/items/f7c6c1d782fb010f04f4 Note that no third party may enter the room during the exam. 2021-06-12 16:52:12
GCP New posts tagged gcp - Qiita [GCP] Organization policy vs. Dataflow https://qiita.com/sabawanco/items/edbbcf81774ae7f32b98 Summary: applying the organization policy on external IPs makes Dataflow fail with its default settings. 2021-06-12 16:53:20
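A hedged sketch of the usual workaround when that organization policy is enforced: launching the Beam pipeline on Dataflow with public IPs disabled, on a subnetwork that has Private Google Access enabled. The project, bucket, region and subnetwork names are placeholders, not taken from the post:

```python
# Run a trivial Beam pipeline on Dataflow using internal IPs only,
# so workers comply with a "no external IP" organization policy.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                                    # placeholder
    "--region=us-central1",                                    # placeholder
    "--temp_location=gs://my-bucket/tmp",                      # placeholder
    "--subnetwork=regions/us-central1/subnetworks/my-subnet",  # needs Private Google Access
    "--no_use_public_ips",                                     # workers get internal IPs only
])

with beam.Pipeline(options=options) as p:
    _ = p | beam.Create(["hello", "dataflow"]) | beam.Map(print)
```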
Overseas TECH DEV Community Sharing a Custom Report will share the report configuration and data included in the report https://dev.to/tesla91615060/sharing-a-custom-report-will-share-the-report-configuration-and-data-included-in-the-report-3n9g A GAIQ practice question (TRUE/FALSE) with its explanation: assets such as Custom Segments, Goals, Custom Channel Groupings, Custom Attribution Models and Custom Reports are created and managed at the reporting-view level, and when you share an asset only the configuration information is copied and shared; none of your Analytics account data or personal information is shared. 2021-06-12 07:50:32
Overseas TECH DEV Community User Defined Snippets in VSCode https://dev.to/kpalaw/user-defined-snippets-in-vscode-561l A tutorial on user-defined snippets in VS Code: open File > Preferences > User Snippets and type "html" to get an html.json template, then define each snippet with a "prefix" (the word or abbreviation you type to trigger it), a "body" (the text that gets inserted, optionally with placeholders you can tab between), and a "description"; the examples include an HTML boilerplate snippet ("athtml") and div-tag snippets with class and id attributes ("divc", "divi", "divic"). 2021-06-12 07:46:51
Overseas TECH DEV Community GOOGLE ANALYTICS INDIVIDUAL QUALIFICATION EXAM ANSWERS 2021 https://dev.to/tesla91615060/google-analytics-individual-qualification-exam-answers-2021-15il An overview of the Google Analytics Individual Qualification (GAIQ): a free certification exam run by Google through the Skillshop learning platform, drawing on the Google Analytics Beginners and Advanced courses; passing earns a certificate that is valid for a limited period from the date of completion, can be added to or shared on a LinkedIn profile, and can be verified through the link Google provides. 2021-06-12 07:44:14
Overseas TECH DEV Community PHP 8: is it serious? Should you learn it? https://dev.to/ericlecodeur/php-8-c-est-du-serieux-devriez-vous-l-apprendre-54a PHP is an open-source, general-purpose language designed for web development and the most widely used server-side language on the web, powering sites such as Facebook, WordPress, Wikipedia, Tumblr and Slack; PHP 8 is mature and object-oriented, backed by a large, welcoming community and high-quality frameworks such as Symfony and Laravel, so it can handle projects of any size and is well worth learning and mastering. 2021-06-12 07:25:32
Overseas TECH DEV Community Data Processing in AWS Sagemaker https://dev.to/aws-builders/data-processing-in-aws-sagemaker-20gi A walkthrough of data processing in SageMaker using the Big Mart sales dataset from Kaggle: preprocessing in a Jupyter notebook with Pandas and Scikit-Learn (null-value imputation, label encoding of categorical columns, standardization, train/validation split), running the same script in SageMaker's built-in Scikit-Learn container via SKLearnProcessor, building a custom Docker image, pushing it to ECR and running it with ScriptProcessor, and finally launching processing jobs from outside SageMaker with boto3's create_processing_job; inputs are read from and outputs written back to S3, and job status can be checked with describe_processing_job. (Adapted from the book Practical Machine Learning in AWS.) 2021-06-12 07:10:13
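For reference, a minimal sketch of the SKLearnProcessor pattern the article walks through; the bucket name, container framework version and instance type are placeholders or assumptions rather than values from the article:

```python
# Launch a SageMaker processing job in the built-in Scikit-Learn container,
# following the SKLearnProcessor workflow described in the article above.
from sagemaker import get_execution_role
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.processing import ProcessingInput, ProcessingOutput

role = get_execution_role()

sklearn_processor = SKLearnProcessor(
    framework_version="0.20.0",    # assumed Scikit-Learn container version
    role=role,
    instance_type="ml.m5.xlarge",  # placeholder instance type
    instance_count=1,
)

sklearn_processor.run(
    code="preprocessing.py",       # the preprocessing script from the article
    inputs=[ProcessingInput(
        source="s3://my-bucket/Train.csv",       # placeholder bucket
        destination="/opt/ml/processing/input",
    )],
    outputs=[
        ProcessingOutput(output_name="train_data",
                         source="/opt/ml/processing/train",
                         destination="s3://my-bucket/output"),
        ProcessingOutput(output_name="test_data",
                         source="/opt/ml/processing/test",
                         destination="s3://my-bucket/output"),
    ],
    arguments=["--train-test-split-ratio", "0.2"],
)
```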
Finance News - 保険市場TIMES Mitsui Sumitomo Insurance begins selling search-cost coverage for when elderly policyholders go missing https://www.hokende.com/news/blog/entry/2021/06/12/170000 With missing-person cases on the rise, Mitsui Sumitomo Insurance Co., Ltd. announced that it will begin selling a rider covering search costs when the insured goes missing, attached to personal accident insurance that elderly people can take out. 2021-06-12 17:00:00
Overseas news Japan Times latest articles Japan's older workers have fewer friends but more motivated to work than peers abroad https://www.japantimes.co.jp/news/2021/06/12/national/social-issues/older-people-friends-work-motivation/ When asked whether they had close friends other than their family members, … of Japanese respondents said they did not, as the pandemic curtails social … 2021-06-12 16:36:45
News BBC News - Home Euro 2020 and Covid: How can I watch with my friends? https://www.bbc.co.uk/news/uk-57386719 covid 2021-06-12 07:38:11
Hokkaido Hokkaido Shimbun Two infections in the Kushiro region: novel coronavirus https://www.hokkaido-np.co.jp/article/554812/ Novel coronavirus 2021-06-12 16:09:00
Hokkaido Hokkaido Shimbun "Walking jewels" spark keen interest: Kushiro City Museum holds insect observation event https://www.hokkaido-np.co.jp/article/554811/ Keen interest 2021-06-12 16:08:00
IT Weekly ASCII Ootoya's summer-only menu items featuring ume plum, such as "chicken cutlet wrapped with ume and shiso" https://weekly.ascii.jp/elem/000/004/058/4058733/ Limited-time menu 2021-06-12 16:30:00
