Posted: 2020-10-03 01:29:48 RSS feed digest for 2020-10-03 01:00 (36 items)

Category  Site name  Article title / trend word  Link URL  Frequent words / summary / search volume  Registered date
AWS AWS The Internet of Things Blog Creating Object Recognition with Espressif ESP32 https://aws.amazon.com/blogs/iot/creating-object-recognition-with-espressif-esp32/ Creating Object Recognition with Espressif ESP. By using low-cost embedded devices like the Espressif ESP family and the breadth of AWS services, you can create an advanced object recognition system. The ESP microcontroller is a highly integrated solution for Wi-Fi and Bluetooth IoT applications with around external components. In this example we use the AI Thinker ESP CAM variant that comes with an … 2020-10-02 15:22:53
Program New questions for [all tags] | teratail Enlarging an image with a for loop in OpenCV https://teratail.com/questions/295622?rss=all Enlarging an image with a for loop in OpenCV. 2020-10-03 00:55:17
Program New questions for [all tags] | teratail Fetching from a local API server inside getServerSideProps returns Not Found https://teratail.com/questions/295621?rss=all Fetching from a local API server inside getServerSideProps returns Not Found. Inside Next.js getServerSideProps, trying to fetch data from a Go backend API running on a local port results in Not Found. 2020-10-03 00:48:59
Program New questions for [all tags] | teratail About the options of git commit -am https://teratail.com/questions/295620?rss=all Detail Nothing 2020-10-03 00:41:11
Program New questions for [all tags] | teratail Using Crypto++ in Visual Studio 2013 https://teratail.com/questions/295619?rss=all Using Crypto in Visual Studio. Error: only a class name or a namespace name can be followed by the quoted token. I am studying file encryption and decryption in C. 2020-10-03 00:29:14
Program New questions for [all tags] | teratail Drawing straight lines with mplfinance https://teratail.com/questions/295618?rss=all Drawing straight lines with mplfinance. mplfinance has been updated, and a DataFrame in the following format can now be drawn as a candlestick chart with little effort. 2020-10-03 00:24:52
Program New questions for [all tags] | teratail Building an app with React Native https://teratail.com/questions/295617?rss=all reactnative 2020-10-03 00:21:22
Program New questions for [all tags] | teratail I don't understand the sequence calculation that uses bit operations in Daifugo https://teratail.com/questions/295616?rss=all I don't understand the sequence calculation that uses bit operations in Daifugo. Environment: Unity, C. I would like to understand the sequence-related calculations in Daifugo that use the bit operations described on the site below. 2020-10-03 00:06:23
Program New questions for [all tags] | teratail Flutter behavior when fewer data items are returned than requested https://teratail.com/questions/295615?rss=all Flutter behavior when fewer data items are returned than requested. In Flutter, when I want to fetch a set number of images, I want the front-end screen to show that many grayed-out placeholder boxes first and then render each image into its box once it has been fetched. 2020-10-03 00:00:39
Program New questions for [all tags] | teratail About the double effect on layer masks in Photoshop https://teratail.com/questions/295614?rss=all About the double effect on layer masks in Photoshop. There is something I don't understand about Photoshop. 2020-10-03 00:00:38
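AWS New posts tagged AWS - Qiita [SOA exam prep] CloudWatch https://qiita.com/Kouichi_Itagaki/items/9043e65205552d8200c3 A service that aggregates logs from various AWS resources into CloudWatch Logs and dashboards, analyzes logs, monitors state with metric filters, and fires an Alarm in specific states so that errors can be detected. 2020-10-03 00:09:53
Docker New posts tagged docker - Qiita Ensuring reproducibility with Keras in TensorFlow 2.3 https://qiita.com/temple1026/items/05546696f5dc9828e270 Ensuring reproducibility with Keras in TensorFlow. Introduction: when training a model in TensorFlow, the articles below suggest that fixing the seed keeps results reproducible, but in my TensorFlow environment fixing the seed alone did not give the same result every time. Referenced articles: TensorFlow x tf.keras, fixing random seeds to improve reproducibility; guaranteeing reproducibility in Keras; notes on ensuring reproducible GPU computation with the current tensorflow tf.keras; results differ in tensorflow even with the same code. 2020-10-03 00:53:27
Tech blog Developers.IO [Report] CUS-86: A revolution in credit systems drives the revitalization of commerce. Net Protections' strategy for moving payments and finance to microservices with Amazon ECS and Amazon SageMaker #AWSSummit https://dev.classmethod.jp/articles/aws-summit-online-2020-session-report-cus-86/ [Report] CUS: A revolution in credit systems drives the revitalization of commerce. 2020-10-02 15:32:00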
Overseas tech Ars Technica Nearly 20,000 workers have had COVID-19, Amazon admits https://arstechnica.com/?p=1711245 amazon 2020-10-02 15:53:38
Overseas tech Ars Technica SpaceX, Northrop seek to break launch gremlin curse with Friday night attempts https://arstechnica.com/?p=1711240 attempts, spacex 2020-10-02 15:35:30
Apple AppleInsider - Frontpage News How to use Picture in Picture in tvOS 14 https://appleinsider.com/articles/20/10/02/how-to-use-picture-in-picture-in-tvos-14 How to use Picture in Picture in tvOS. Apple TV's updated Picture in Picture feature is useful, but it doesn't work with everything you want to watch, and it's a little awkward to use. With the updated Picture in Picture feature, you can stream security cameras or watch two videos at once. Once your Apple TV has the new tvOS or later, you can send what you're watching into a corner of the screen. It carries on playing as you start an Apple Arcade game, as you search the App Store for something, or when you are just looking for anything better to watch. 2020-10-02 15:35:38
Apple AppleInsider - Frontpage News Today only: 8-core 15-inch MacBook Pro drops to $1,849 ($950 off) https://appleinsider.com/articles/20/10/02/today-only-8-core-15-inch-macbook-pro-drops-to-1849-950-off Today only: core inch MacBook Pro drops to off. Amazon-owned Woot's latest flash deal offers substantial savings on Apple's inch MacBook Pro that's equipped with a Core i processor, GB SSD and upgraded graphics. Flash MacBook Pro deal: the daily deal offers shoppers off Apple's Mid inch MacBook Pro, bringing the price down to. These units are refurbished by Apple but come with a year Woot warranty in lieu of an Apple warranty and are packaged in a generic white box. 2020-10-02 15:25:21
Apple AppleInsider - Frontpage News Apple TV+ review: 'Tiny World' gets back to nature, with Paul Rudd narrating https://appleinsider.com/articles/20/10/02/apple-tv-review-tiny-world-gets-back-to-nature-with-paul-rudd-narrating Apple TV review: 'Tiny World' gets back to nature, with Paul Rudd narrating. Ant-Man actor Paul Rudd hosts a beautifully rendered take on the smaller side of the animal kingdom in Tiny World. Paul Rudd narrates Tiny World, premiering Friday, October, exclusively on Apple TV. Tiny World, somewhat improbably, is not the first Apple TV original project to feature an extreme closeup of a small dung beetle diving at a fresh piece of elephant dung. This also happened in The Elephant Queen, the documentary that debuted on the service around the time of its launch late last year. 2020-10-02 15:55:35
Apple AppleInsider - Frontpage News New 5G iPhone SE with dual-lens camera in 2022, ProMotion in 'iPhone 13' display analyst says https://appleinsider.com/articles/20/10/02/new-5g-iphone-se-with-dual-lens-camera-in-2022-promotion-in-iphone-13-display-analyst-says New G iPhone SE with dual-lens camera in, ProMotion in 'iPhone', display analyst says. A new rumor chimes in on ProMotion arriving with the iPhone, and Apple may not release a new iPhone SE until, but it will arrive with G support, a dual camera setup, and a larger display when it does. Credit: Andrew O'Hara, AppleInsider. According to display expert Ross Young, Apple won't release a new iPhone SE model in the spring of. Instead, a successor to the low-cost iPhone arrives in the spring of. 2020-10-02 15:54:47
Overseas tech Engadget Amazon Music HD is adding thousands more Ultra HD songs and albums https://www.engadget.com/amazon-universal-warner-music-group-remaster-hd-songs-155318511.html Amazon Music HD is adding thousands more Ultra HD songs and albums. Amazon introduced its high-res music streaming tier, Amazon Music HD, last fall. Now it says the service is about to get a whole lot better. Amazon Music is teaming up with Universal Music Group and Warner Music Group to remaster thousands of songs 2020-10-02 15:53:18
Overseas tech Engadget Apple Watch Series 6 review: The best new features are the boring ones https://www.engadget.com/apple-watch-series-6-review-153047133.html Apple Watch Series review: The best new features are the boring ones. This fall marks the fifth anniversary of the original Apple Watch. Other than the basic design itself (a square display with a digital crown and a mostly familiar lineup of wrist straps), a lot has changed. Gone is the solid gold edition tha 2020-10-02 15:30:56
Overseas tech Engadget 'Fall Guys' season 2 begins October 8th https://www.engadget.com/fall-guys-season-2-launch-date-october-8-150844239.html 'Fall Guys' season begins October th. Good news for fans of the wildly popular game Fall Guys: the game's second season will be launching on October th, according to its Twitter account. That means those who have been waiting for more rounds have less than a week left to wait. The game 2020-10-02 15:08:44
Cisco Cisco Blog Cisco Named a Leader in Aragon Research Globe for Team Collaboration 2020 https://blogs.cisco.com/collaboration/cisco-named-a-leader-in-aragon-research-globe-for-team-collaboration-2020 Cisco Named a Leader in Aragon Research Globe for Team Collaboration. This week, industry analyst firm Aragon Research published their annual Aragon Research Globe for Team Collaboration, and I am thrilled to announce that Cisco has again been identified as a Leader. 2020-10-02 15:51:40
Cisco Cisco Blog Disruption Leads to Innovation: Cisco at NVIDIA GTC https://blogs.cisco.com/partner/disruption-leads-to-innovation-cisco-at-nvidia-gtc Disruption Leads to Innovation: Cisco at NVIDIA GTC. Innovation has emerged front and center as the means to cope with epic change. One example is the Virtual Workstation solution with Cisco UCS and NVIDIA GPUs. These Virtual Workstations are giving employees the ability to work on highly complex and graphics-intensive applications remotely. Where do you learn the most up-to-date information on innovation? The answer is NVIDIA's GPU Technology Conference. 2020-10-02 15:00:48
Overseas tech CodeProject Latest Articles Building a Database Application in Blazor - Part 2 - Services - Building the CRUD Data Layers https://www.codeproject.com/Articles/5279596/Building-a-Database-Application-in-Blazor-Part-2-S application 2020-10-02 15:55:00
Overseas tech CodeProject Latest Articles A Beginner's Tutorial for Understanding and Implementing a CRUD APP using Elasticsearch and C# - Part 2 https://www.codeproject.com/Articles/1033116/A-Beginners-Tutorial-for-Understanding-and-Imple-3 integration 2020-10-02 15:55:00
Overseas tech WIRED I'm Done Being Mistaken for Jeff Bezos and MacKenzie Scott https://www.wired.com/story/done-being-mistaken-jeff-bezos-mackenzie-scott address 2020-10-02 15:56:30
Overseas news Japan Times latest articles Brazil’s Amazon sees nearly two-thirds more fires than last September https://www.japantimes.co.jp/news/2020/10/02/world/brazil-amazon-two-thirds-more-fires/ Brazil's Amazon sees nearly two-thirds more fires than last September. Satellites used by the National Institute of Space Research detected outbreaks last month in the Amazon compared to in the same month in 2020-10-03 00:37:57
News BBC News - Home Trump Covid: US president has mild symptoms - White House https://www.bbc.co.uk/news/world-us-canada-54391986 president 2020-10-02 15:24:54
News BBC News - Home Covid: Growth in Covid cases 'may be levelling off' https://www.bbc.co.uk/news/health-54387057 previous 2020-10-02 15:38:16
News BBC News - Home Brexit: EU calls for trade talks to 'intensify' ahead of call with UK https://www.bbc.co.uk/news/uk-54384437 boris 2020-10-02 15:16:45
News BBC News - Home London Marathon: Kenenisa Bekele to miss race because of injury https://www.bbc.co.uk/sport/athletics/54386018 London Marathon: Kenenisa Bekele to miss race because of injury. Kenenisa Bekele's much-anticipated London Marathon duel with Eliud Kipchoge is off as the Ethiopian pulls out of Sunday's race with a calf injury. 2020-10-02 15:16:39
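Hokkaido Hokkaido Shimbun Kiryu wins his second title in 10.27 seconds: athletics, Japan Championships day 2 https://www.hokkaido-np.co.jp/article/466702/ Japan Championships 2020-10-03 00:13:24
Hokkaido Hokkaido Shimbun Levanga to face Nagoya D in season opener on the 3rd https://www.hokkaido-np.co.jp/article/466711/ season opener 2020-10-03 00:03:04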
GCP Cloud Blog Gauge the effectiveness of your DevOps organization running in Google Cloud https://cloud.google.com/blog/products/devops-sre/another-way-to-gauge-your-devops-performance-according-to-dora/ Gauge the effectiveness of your DevOps organization running in Google Cloud. Editor's note: There are many ways to skin the DevOps cat. Google Cloud Developer Programs Engineer Dina Graves Portman recently wrote about how to evaluate your DevOps effectiveness using the open source Four Keys project. Here, Google Customer Engineer Brian Kaufman shows you how to do the same thing, but for an application that runs entirely on Google Cloud.
Many organizations aspire to become true, high-functioning DevOps shops, but it can be hard to know where you stand. According to DevOps Research and Assessment, or DORA, you can prioritize just four metrics to measure the effectiveness of your DevOps organization: two to measure speed and two to measure stability. Speed: Lead Time for Changes (code commit to code in production) and Deployment Frequency (how often you push code). Stability: Change Failure Rate (rate of deployment failures in production that require immediate remedy, that is, a rollback or manual change) and Time to Restore Service (MTTR, mean time to recovery).
In this post, we present a methodology to collect these four metrics from software delivery pipelines and applications deployed in Google Cloud. You can then use those metrics to rate your overall practice effectiveness, baseline your organization's performance against DORA industry benchmarks, and determine whether you're an Elite, High, Medium or Low performer. Let's take a look at how to do this in practice with a sample architecture running on Google Cloud.
Services and reference architecture
To get started, we create a CI/CD pipeline with the following cloud services: GitHub (code repo); Cloud Build (a container-based CI/CD tool); Container Registry; Google Kubernetes Engine (GKE); Cloud Load Balancing (used as an ingress controller for GKE); Cloud Uptime Checks (for synthetic application monitoring); Cloud Monitoring; Cloud Functions; and Pub/Sub (used as a message bus to connect alerts to Cloud Functions). These are combined into the reference architecture below. Note that all of these Google Cloud services are integrated with Cloud Monitoring. As such, there's nothing in particular that you need to set up to receive service logs, and many of these services have built-in metrics that we'll use in this post. Figure: Google Cloud Platform CI/CD pipeline and application topology.
Measuring speed
To measure our two speed metrics, deployment frequency and lead time for changes, we instrument Cloud Build, which is a continuous integration and continuous delivery tool. As a container-based CI/CD tool, Cloud Build lets you load a series of Google-managed or community-managed Cloud Builders to manipulate your code or interact with internal and external services during the build and deployment process. Upon firing a build trigger, Cloud Build reaches into our Git repository for our source code, creates a container image artifact that it pushes to the container registry, and then deploys the container image to a GKE cluster. You can also import your own cloud builder container and insert it as the final build step to determine the time from commit to deployment, as well as whether this is a rollback deployment.
For this example, we've created a custom container, used as the last build step, that does two things. First, it retrieves the payload binding for the commit timestamp, accessed through the variable push.repository.pushed_at, and compares it against the current timestamp to calculate lead time. The payload binding variable is set when we create the trigger and is referenced by a custom variable, MERGE_TIME, in cloudbuild.yaml. Second, it reaches into the source repo to get the commit ID of the latest commit on the master branch and compares it to the commit ID of the current build to determine whether the build is a rollback or a match.
You can find a reference Cloud Build config YAML here that shows each build step described above. If you're using a non-built-in variable like the MERGE_TIME payload binding in your config file, you need to map that variable to the push.repository.pushed_at value when you set up the Cloud Build trigger. You can find the custom cloud builder container used here. After the build step for this container runs, the following is output to the Cloud Build logs, which are fed automatically into Cloud Monitoring. Notice the commit ID, Rollback value and LeadTime value, which are written to the logs by our custom cloud builder.
Next, we can create a log-based metric in Cloud Logging to absorb these custom values. Log-based metrics can be based on filters for specific log entries. Once we have a filter for our specific log entries, we can use regular expressions, each assigned to a particular piece of the output, to capture specific sections of the log entry into metrics. In the screenshots below, we created labels for the commit name and rollback value that attach to the LeadTime value that shows up in the textPayload field of our log. Figure: Create log-based metric and labels.
Lead Time for Changes
Once we have the above metric and labels created from our Cloud Build log, we can access it in the Cloud Operations Metrics explorer via the metric label logging/user/dorametrics (DoraMetrics was the name we gave our log-based metric). The value of the metric is the LeadTime extracted by the regular expression above, with rollbacks filtered out. We use the median (the 50th percentile).
Deployment Frequency
Now that we have the lead time for each commit, we can determine the frequency of deployments by counting the number of lead times we recorded in a window.
Measuring stability
Change Failure Count: to determine the number of software rollbacks that were performed, we can look at our Deployment Frequency and filter for metrics with Rollback set to True. This gives us a count of the total rollbacks performed. To determine the Change Failure Rate, we would take the count from this chart and divide it by the Deployment Frequency metric collected above for the same window.
Mean Time To Resolution (MTTR): in typical enterprise environments, there are incident response systems that let you determine when an issue was reported and when it was ultimately resolved. Assuming these times can be queried, MTTR is the average time between the reported and resolved timestamps of the issues. In this blog, we use automation to alert on and graph issues, which allows us to gather more accurate service disruption metrics. Our strategy involves the use of Service Level Objectives (SLOs), each of which pairs a Service Level Indicator (SLI) that we've determined represents our customers' happiness with our application with an objective. When we violate an SLO, we consider the time to restore service to be the total time it takes to detect, mitigate, and resolve the problem until we are back in compliance with the SLO.
MTTR and customer satisfaction
For simplicity, we've highlighted one metric we feel represents our customer satisfaction overall: HTTP response code errors from our website. The ratio of this metric to the total response codes sent over a given time window constitutes our Service Level Indicator (SLI). For total errors, we monitor response codes returned from our front-end load balancer, which is set up as an ingress controller in our GKE cluster. Metric used: loadbalancing.googleapis.com/https/request_count, grouped by response code. Using this metric, we can build our SLI and wrap it into an SLO that represents the customer satisfaction observed over a longer time window. Using the SLO API, we create custom SLOs that represent the level of customer satisfaction we want to monitor, where being in violation of that SLO indicates an issue. There's a great tutorial on how to create custom SLOs and services here.
In this example, we've created a custom service to represent our application and an SLO for HTTP LB response codes. It assumes a quality-of-service level in which a fixed share of responses from the load balancer must not be errors in a given day. Doing this automatically creates an error budget for that window. Now, when it comes to monitoring for MTTR, we have a metric (SLI) that's attached to a service level (SLO) that represents quality of service over a given window of time. The failure of the SLO is simulated in the screenshot below.
Next, we set up an alert policy that fires when we are in danger of violating this SLO. This also starts a timer to calculate the time to resolution. What we're measuring here is referred to as burn rate: how much of our error budget we are eating up with the current SLI metric. The window we measure for our alert is much smaller than our entire SLO, so when the SLI has moved back within compliance of a threshold, another alert fires, indicating the incident has cleared. For more information on setting up alerting policies, please visit this page. You can also send out alerts through a variety of channels, allowing you to integrate into existing ticketing or messaging systems to record the MTTR in a way that makes sense for your organization. For our purposes, we integrate with the Pub/Sub message bus channel, sending the alerts to a cloud function that performs the necessary charting calculations.
In the message from the clearing alert, we see that the JSON payload has the started_at and ended_at timestamps. We use these timestamps in our cloud function to calculate the time to resolve the issue and then output it to the logs. Here is the entire Pub/Sub message sent to Cloud Functions. Here is the cloud function connected to the same Pub/Sub topic as the alert. This results in the following messages being sent to the Cloud Functions logs. The final step is to create another log-based metric to pick up the Time to Resolve value that we print to our Cloud Functions log. We do so with this regular expression: Resolve s. Now the metric is available in Cloud Operations.
Conclusion
We've shown above how you can create custom cloud builders in Cloud Build to generate metrics relating to deployment frequency, mean time to deployment, and rollbacks that will appear in Cloud Operations logs. We've also shown you how to use SLOs and SLIs to generate and push alerts to your Cloud Functions logs. We've used log-based metrics to pull our metrics out of the logs and chart them. These metrics can be used to evaluate the effectiveness of your organization's software development and delivery pipelines over time, as well as help you evaluate your performance amongst the greater DevOps community. Where does your organization land?
For more inspiration, here is some further reference material to help you measure the effectiveness of your own DevOps organization: Google Cloud Application Modernization Program (blog); Setting SLOs: a step-by-step guide (blog); Setting SLOs: observability using custom metrics (blog); Concepts in Service Monitoring (documentation); Working with the SLO API (documentation); How to create SLOs in the GCP Console (video); How to create SLOs at scale with the SLO API (video); How to create SLOs using custom metrics (video); GitHub: SLO API code used for this blog; DORA Quick Check; The Four Keys Project for DORA metric ingestion into BigQuery; new ways we're improving observability with Cloud Ops (blog). Related article: Are you an Elite DevOps performer? Find out with the Four Keys Project. Learn how the Four Keys open source project lets you gauge your DevOps performance according to DORA metrics. 2020-10-02 16:00:00
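The arithmetic behind the four metrics in the row above can be sketched in a few lines of Python. This is a minimal illustration rather than the article's code: the `Deployment` record, its field names, and the `(started_at, ended_at)` tuples are assumptions standing in for values parsed from the Cloud Build and Cloud Functions logs, and the sketch assumes that deployment frequency counts every recorded build, rollbacks included.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from statistics import mean, median


@dataclass(frozen=True)
class Deployment:
    """One build record. Field names are hypothetical, not Cloud Build's schema."""
    commit_id: str
    pushed_at: datetime    # commit/push timestamp from the trigger payload
    deployed_at: datetime  # time the final build step ran
    rollback: bool         # True when the build redeploys an older commit


def median_lead_time(deployments):
    """Median commit-to-deploy time, with rollbacks filtered out."""
    seconds = [
        (d.deployed_at - d.pushed_at).total_seconds()
        for d in deployments
        if not d.rollback
    ]
    return timedelta(seconds=median(seconds))


def deployment_frequency(deployments):
    """Number of deployments recorded in the window (assumed to include rollbacks)."""
    return len(deployments)


def change_failure_rate(deployments):
    """Rollback count divided by deployment frequency for the same window."""
    total = len(deployments)
    if total == 0:
        return 0.0
    return sum(1 for d in deployments if d.rollback) / total


def mean_time_to_resolve(incidents):
    """Average of (ended_at - started_at), both given as epoch seconds."""
    return timedelta(seconds=mean(end - start for start, end in incidents))


def error_ratio(error_responses, total_responses):
    """SLI: share of load balancer responses that were errors in a window."""
    return error_responses / total_responses if total_responses else 0.0


if __name__ == "__main__":
    t0 = datetime(2020, 10, 1, 12, 0, tzinfo=timezone.utc)
    window = [
        Deployment("a1", t0, t0 + timedelta(minutes=10), False),
        Deployment("b2", t0, t0 + timedelta(minutes=20), False),
        Deployment("a1", t0, t0 + timedelta(minutes=5), True),
    ]
    print("median lead time:", median_lead_time(window))
    print("deployments in window:", deployment_frequency(window))
    print("change failure rate:", change_failure_rate(window))
    print("MTTR:", mean_time_to_resolve([(100, 400), (1000, 1500)]))
```

Keeping each metric as a pure function over already-parsed records separates the arithmetic from however the log entries are collected, which is the part that depends on the specific pipeline.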
GCP Cloud Blog Toward automated tagging: bringing bulk metadata into Data Catalog https://cloud.google.com/blog/products/data-analytics/best-practices-for-bulk-ingestion-of-metadata-to-cloud-data-catalog/ Toward automated tagging: bringing bulk metadata into Data Catalog. Data Catalog lets you ingest and edit business metadata through an interactive interface. It includes programmatic interfaces that can be used to automate your common tasks. Many enterprises have to define and collect a set of metadata using Data Catalog, so we'll offer some best practices here on how to declare, create, and maintain this metadata in the long run.
In our previous post, we looked at how tag templates can facilitate data discovery, governance, and quality control by describing a vocabulary for categorizing data assets. In this post, we'll explore how to tag data using tag templates. Tagging refers to creating an instance of a tag template and assigning values to the fields of the template in order to classify a specific data asset. As of this writing, Data Catalog supports three storage back ends: BigQuery, Cloud Storage and Pub/Sub. We'll focus here on tagging assets that are stored on those back ends, such as tables, columns, files, and message topics.
We'll describe three usage models that are suitable for tagging data within a data lake and data warehouse environment: provisioning of a new data source, processing derived data, and updating tags and templates. For each scenario, you'll see our suggested approach for tagging data at scale.
Provisioning data sources
Provisioning a data source typically entails several activities: creating tables or files depending on the storage back end, populating them with some initial data, and setting access permissions on those resources. We add one more activity to this list: tagging the newly created resources in Data Catalog. Here's what that step entails.
Tagging a data source requires a domain expert who understands both the meaning of the tag templates to be used and the semantics of the data in the data source. Based on their knowledge, the domain expert chooses which templates to attach as well as what type of tag to create from those templates. It is important for a human to be in the loop, given that many decisions rely on the accuracy of the tags.
We've observed two types of tags based on our work with clients. One type is referred to as static because the field values are known ahead of time and are expected to change only infrequently. The other type is referred to as dynamic because the field values change on a regular basis based on the contents of the underlying data. An example of a static tag is the collection of data governance fields that include data domain, data confidentiality, and data retention. The values of those fields are determined by an organization's data usage policies. They are typically known by the time the data source is created, and they do not change frequently. An example of a dynamic tag is the collection of data quality fields, such as number of values, unique values, min value, and max value. Those field values are expected to change frequently, whenever a new load runs or modifications are made to the data source.
In addition to these differences, static tags also have a cascade property that indicates how their fields should be propagated from source to derivative data. We'll expand on this concept in a later section. By contrast, dynamic tags have a query expression and a refresh property to indicate the query that should be used to calculate the field values and the frequency with which they should be recalculated. An example of a config for a static tag is shown in the first code snippet, and one for a dynamic tag is shown in the second. Figures: YAML-based static tag config; YAML-based dynamic tag config.
As mentioned earlier, a domain expert provides the inputs to those configs when they are setting up the tagging for the data source. More specifically, they first select the templates to attach to the data source. Secondly, they choose the tag type to use, namely static or dynamic. Thirdly, they input the values of each field and their cascade setting if the type is static, or the query expression and refresh setting if the type is dynamic. These inputs are provided through a UI so that the domain expert doesn't need to write raw YAML files.
Once the YAML files are generated, a tool parses the configs and creates the actual tags in Data Catalog based on the specifications. The tool also schedules the recalculation of dynamic tags according to the refresh settings. While a domain expert is needed for the initial inputs, the actual tagging tasks can be completely automated. We recommend following this approach so that newly created data sources are not only tagged upon launch, but tags are also maintained over time without the need for manual labor.
Processing derivative data
In addition to tagging data sources, it's important to be able to tag derivative data at scale. We define derivative data in broad terms, as any piece of data that is created from a transformation of one or more data sources. This type of data is particularly prevalent in data lake and warehousing scenarios, where data products are routinely derived from various data sources.
The tags for derivative data should consist of the origin data sources and the transformation types applied to the data. The origin data sources' URIs are stored in the tag, along with one or more transformation types, namely aggregation, anonymization, normalization, etc. We recommend baking the tag creation logic into the pipeline that generates the derived data. This is doable with Airflow DAGs and Beam pipelines. For example, if a data pipeline is joining two data sources, aggregating the results, and storing them into a table, you can create a tag on the result table with references to the two origin data sources and aggregation = true. You can see this code snippet of a Beam pipeline that creates such a tag. Figure: Beam pipeline with tagging logic.
Once you've tagged derivative data with its origin data sources, you can use this information to propagate the static tags that are attached to those origin data sources. This is where the cascade property comes into play, which indicates which fields should be propagated to their derivative data. An example of the cascade property is shown in the first code snippet above, where the data domain and data confidentiality fields are both to be propagated, whereas the data retention field is not. This means that any derived tables in BigQuery will be tagged with data domain = HR and data confidentiality = CONFIDENTIAL using the dg template.
Handling updates
There are several scenarios that require update capabilities for both tags and templates. For example, if a business analyst discovers an error in a tag, one or more values need to be corrected. If a new data usage policy gets adopted, new fields may need to be added to a template and existing fields renamed or removed.
We provide configs for tag and template updates, as shown in the figures below. The tag update config specifies the current and new values for each field that is changing. The tool processes the config and updates the values of the fields in the tag based on the specification. If the updated tag is static, the tool also propagates the changes to the same tags on derivative data.
The template update config specifies the field name, field type, and any enum value changes. The tool processes the update by first determining the nature of the changes. As of this writing, Data Catalog supports field additions and deletions to templates as well as enum value additions, but field renamings or type changes are not yet supported. As a result, the tool modifies the existing template if a simple addition or deletion is requested. Otherwise, it has to recreate the entire template and all of its dependent tags. Figures: YAML-based tag update config; YAML-based template update config.
We've started prototyping these approaches to release an open source tool that automates many tasks involved in creating and maintaining tags in Data Catalog in accordance with our proposed usage model. Keep an eye out for that. In the meantime, learn more about Data Catalog tagging. 2020-10-02 16:00:00
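The cascade rule and the template-update decision described in the row above can be sketched in plain Python. This is a hedged illustration, not the article's YAML schema or the Data Catalog API: the `StaticField` type, the dictionary layout, the `RETENTION_PLACEHOLDER` value, the example URIs, and the change-kind strings are all assumptions made for the example. Only the HR and CONFIDENTIAL values and the cascade behaviour come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticField:
    """A field of a static tag; the shape is hypothetical."""
    value: str
    cascade: bool  # True if the field should propagate to derivative data


def propagate_static_tag(fields):
    """Return only the field values that cascade to derivative data."""
    return {name: f.value for name, f in fields.items() if f.cascade}


def derivative_tag(origin_uris, transformations):
    """Build tag fields for derived data: its origins plus one flag per transformation."""
    tag = {"origin_data_sources": list(origin_uris)}
    tag.update({name: True for name in transformations})
    return tag


# Change kinds the article says a template supports in place.
IN_PLACE_CHANGES = {"add_field", "delete_field", "add_enum_value"}


def must_recreate_template(change_kinds):
    """True if any requested change (e.g. a rename or type change) cannot be applied in place."""
    return any(kind not in IN_PLACE_CHANGES for kind in change_kinds)


if __name__ == "__main__":
    governance_tag = {
        "data_domain": StaticField("HR", cascade=True),
        "data_confidentiality": StaticField("CONFIDENTIAL", cascade=True),
        # Placeholder value: the article does not state the retention setting.
        "data_retention": StaticField("RETENTION_PLACEHOLDER", cascade=False),
    }
    print(propagate_static_tag(governance_tag))
    print(derivative_tag(["source_a", "source_b"], ["aggregation"]))
    print(must_recreate_template(["add_field", "rename_field"]))
```

Treating the cascade flag as data on each field keeps the propagation step a simple filter, so the same function applies to any template without template-specific logic.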
