Posted at 2023-02-05 08:05:19: RSS feed digest for 2023-02-05 08:00 (7 items)
Category | Site | Article title / trending words | Link URL | Frequent words / summary / search volume | Registered |
---|---|---|---|---|---|
IT | Business+IT latest news | Belgium, soccer's world No. 4 "super powerhouse": the surprising talent-development methods Japan should learn from | https://www.sbbit.jp/movie/13040?ref=rss | | 2023-02-05 07:10:00 |
IT | Business+IT latest news | What makes talent development in Belgium, the world No. 4 "soccer powerhouse", so impressive? | https://www.sbbit.jp/article/cont1/104938?ref=rss | First-division league | 2023-02-05 07:10:00 |
Overseas TECH | DEV Community | Centering Elements in CSS with Tailwind | https://dev.to/smpnjn/centering-elements-in-css-with-tailwind-4cm5 | Tailwind is an increasingly popular and widely used framework which makes it easier, in some cases, to build and ship features and products. This guide covers how to vertically and horizontally center an element or text with Tailwind (the author covers centering with plain CSS in a separate article). Horizontally and vertically centering an element: with flexbox, all we have to do is use `h-screen` to make the element the height of the screen, `flex` to make it a flexbox, `items-center` to center it vertically, and `justify-center` to center it horizontally, so the code looks like `<div class="h-screen flex items-center justify-center">Horizontally and Vertically Centered Element</div>`. Vertically centering an element: the easiest way is the same flexbox approach with `justify-center` removed, so the item is centered only along the vertical axis: `<div class="h-screen flex items-center">Vertically Centered Element</div>`. Horizontally centering an element: the same code again, just without `h-screen`: `<div class="flex justify-center items-center">Horizontally Centered Element</div>`. Conclusion: centering elements with flexbox is easy in Tailwind, and the guide shows how to vertically and horizontally center elements, including HTML elements and text, which is infinitely useful in web development. | 2023-02-04 22:45:51 |
Overseas TECH | DEV Community | Learn how to scrape Trustpilot reviews using Go | https://dev.to/boodyvo/learn-how-to-scrape-trustpilot-reviews-using-go-2glo | Overview: web scraping is a powerful tool for extracting information from websites, and the gathered data can be used for purposes such as data analysis, machine learning and research. The process involves making HTTP requests to a website, parsing the HTML content and extracting the desired information; Go is well suited to it thanks to its efficiency, scalability and built-in libraries. Page overview for scraping: before writing code, inspect the page's HTML source (right-click and select "Inspect" in Chrome or Firefox) to identify the information to extract. Each Trustpilot review card is a div whose classes start with `styles_cardWrapper` and `styles_reviewCard`, so the card can be selected with jQuery-like queries and its text saved as review data. Scraping tools in Go: `net/http` (standard library, low-level but flexible), `github.com/gocolly/colly` (a popular higher-level scraper with concurrency, automatic request retries and cookie/session support), `github.com/PuerkitoBio/goquery` (a jQuery-like API for querying HTML and XML documents) and `github.com/yhat/scrape` (a high-level API with custom extractors for more complex tasks); the tutorial uses plain `net/http` to request pages and `goquery` to query the documents. Scraping the page: a JSON-serializable `Review` struct holds `Text`, `Date`, `Rating`, `Title` and `Link`, and a `ProductReviews` struct holds the product name plus a slice of reviews; the example product is invideo.io, with its URL kept in a constant. `main` calls `getProductReviews` and encodes the result into a `trustpilot_reviews_<product>.json` file with a `json.Encoder`. `getProductReviews` fetches the product page with `http.Get` and turns the response body into a goquery document via `goquery.NewDocumentFromReader`. (Trustpilot does not block simple requests at the time of writing, but many large services do, since such requests clearly do not come from a browser; in that case you would need scraping tools, headless browsers or proxies.) Because pages are scraped in parallel and a slice is not concurrency-safe (parallel writes cause data races), reviews are sent through a `reviewsChan` channel while a single goroutine appends them to the `reviews` slice and closes a `quitChan` when the channel is drained. The page's divs are then processed with `doc.Find("div").Each(extractReviewFunc(reviewsChan, productURL))`. `extractReviewFunc` reads the element's `class` attribute and continues only if the div carries both the `styles_reviewCard` and `styles_cardWrapper` prefixes (filtering out other divs such as advertisements); it extracts the post date from the `time` element's `datetime` attribute, the review text, the title, the link (prefixed with the product URL, which is kept free of query parameters) and the rating from the `img` element's `alt` attribute, and sends a `Review` to the channel. Since `Each` expects a `func(i int, s *goquery.Selection)` but extra parameters (`reviewsChan`, `productURL`) are needed inside it, `extractReviewFunc` is written as a function that returns that closure, a common Go pattern. Pagination: the last pagination link (the `a` element named `pagination-button-last`) is located, and `extractReviewsOverPagesFunc` reads its `href` attribute and uses a regular expression to pull out the number of the last page. The remaining pages are scraped in parallel with a `sync.WaitGroup`, starting from the second page because the first one was already processed; `getPageProductReviews` mirrors `getProductReviews` but requests a specific page URL and does not look for the pagination link. After all goroutines finish, `reviewsChan` is closed (nothing writes to it anymore) and the code waits on `quitChan` until every review has been appended. Running the program produces `trustpilot_reviews_invideo.io.json` containing the product name and an array of reviews with text, date, rating, title and link (for example a review praising the application as excellent but a bit expensive); the full code is available on GitHub. Conclusion: the article covers the basics of web scraping and parsing Trustpilot reviews with Go, from inspecting the HTML structure and identifying the elements to extract to choosing among the popular Go scraping libraries; it is a valuable skill with applications such as lead generation, trend analysis and building datasets for analysis. (Reconstructed Go sketches of the core code appear after the table.) | 2023-02-04 22:25:37 |
Overseas news | Japan Times latest articles | Recipe: Amazake oatmeal | https://www.japantimes.co.jp/life/2023/02/05/food/recipe-box-amazake-oatmeal-breakfast/ | alcohol | 2023-02-05 07:05:42 |
News | BBC News - Home | China balloon: US shoots down airship over Atlantic | https://www.bbc.co.uk/news/world-us-canada-64524105?at_medium=RSS&at_campaign=KARANGA | America | 2023-02-04 22:31:05 |
News | BBC News - Home | Davis Cup 2023: Great Britain reach Finals as Cameron Norrie seals victory over Colombia | https://www.bbc.co.uk/sport/tennis/64525522?at_medium=RSS&at_campaign=KARANGA | Davis Cup: Great Britain reach the Finals as Cameron Norrie seals victory over Colombia. Cameron Norrie confirms Great Britain's place in September's Davis Cup Finals group stage by beating Colombia's Nicolas Mejia in straight sets. | 2023-02-04 22:02:40 |
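
The Go walkthrough in the Trustpilot row above is easier to follow with the code laid out. Below is a minimal single-page sketch reconstructed from that summary, not the author's original source: the Trustpilot URL pattern, the `h2` and data-attribute selectors, and the class prefixes are assumptions lifted from the summary and may have drifted from the live site, and for brevity the JSON is printed to stdout instead of the per-product file the article creates.

```go
// A minimal single-page sketch reconstructed from the summary above, not the author's original source.
// The Trustpilot URL pattern, class prefixes and attribute selectors are assumptions and may have drifted.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"os"
	"strings"

	"github.com/PuerkitoBio/goquery"
)

// Review is the JSON-serializable review structure described in the article.
type Review struct {
	Text   string `json:"text"`
	Date   string `json:"date"`
	Rating string `json:"rating"`
	Title  string `json:"title"`
	Link   string `json:"link"`
}

// ProductReviews groups all reviews scraped for one product.
type ProductReviews struct {
	ProductName string   `json:"product_name"`
	Reviews     []Review `json:"reviews"`
}

const productName = "invideo.io"

// extractReviewFunc returns a closure with the signature goquery's Each expects,
// capturing the channel and product URL it needs (the wrapping pattern from the article).
func extractReviewFunc(reviews chan<- Review, productURL string) func(int, *goquery.Selection) {
	return func(i int, s *goquery.Selection) {
		classes, exists := s.Attr("class")
		if !exists {
			return
		}
		// Keep only divs that are both a review card and a card wrapper (skips ads and other divs).
		isReviewCard, isCardWrapper := false, false
		for _, class := range strings.Split(classes, " ") {
			if strings.HasPrefix(class, "styles_reviewCard") {
				isReviewCard = true
			}
			if strings.HasPrefix(class, "styles_cardWrapper") {
				isCardWrapper = true
			}
		}
		if !isReviewCard || !isCardWrapper {
			return
		}
		// Best-effort selectors reconstructed from the summary, not verified against live markup.
		date := s.Find("time").AttrOr("datetime", "")
		text := s.Find("p[data-service-review-text-typography]").Text()
		title := s.Find("h2").Text()
		link, _ := s.Find("a[data-review-title-typography]").Attr("href")
		if link != "" {
			link = productURL + link
		}
		rating := s.Find("img").AttrOr("alt", "")
		reviews <- Review{Text: text, Date: date, Rating: rating, Title: title, Link: link}
	}
}

func main() {
	productURL := fmt.Sprintf("https://www.trustpilot.com/review/%s", productName) // assumed URL pattern
	res, err := http.Get(productURL)
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()
	doc, err := goquery.NewDocumentFromReader(res.Body)
	if err != nil {
		log.Fatal(err)
	}

	// A slice is not safe for concurrent writes, so reviews flow through a channel
	// and a single collector goroutine appends them.
	reviews := make([]Review, 0)
	reviewsChan := make(chan Review)
	quitChan := make(chan struct{})
	go func() {
		for review := range reviewsChan {
			reviews = append(reviews, review)
		}
		close(quitChan)
	}()

	doc.Find("div").Each(extractReviewFunc(reviewsChan, productURL))
	close(reviewsChan)
	<-quitChan

	// The article writes to trustpilot_reviews_<product>.json; stdout keeps the sketch short.
	if err := json.NewEncoder(os.Stdout).Encode(ProductReviews{ProductName: productName, Reviews: reviews}); err != nil {
		log.Fatal(err)
	}
}
```

The closure returned by `extractReviewFunc` is what lets `Each` receive the extra channel and URL parameters without changing the callback signature goquery requires.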
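
The article's pagination step combines a `sync.WaitGroup` fan-out with the same channel-plus-collector synchronization. The sketch below isolates just that pattern, with a hypothetical `fetchPage` standing in for the article's `getPageProductReviews` and a regexp that anchors on `page=` before capturing the digits (a slight variation on the description above), so the close-then-wait ordering is easy to see.

```go
// A self-contained sketch of the fan-out pattern described above: scrape pages 2..lastPage
// in parallel, funnel results through a channel, and collect them in one goroutine.
package main

import (
	"fmt"
	"log"
	"regexp"
	"strconv"
	"sync"
)

// lastPageFromHref pulls the page number out of a pagination href such as "/review/example.com?page=5".
func lastPageFromHref(href string) (int, error) {
	m := regexp.MustCompile(`page=(\d+)`).FindStringSubmatch(href)
	if m == nil {
		return 0, fmt.Errorf("no page number in %q", href)
	}
	return strconv.Atoi(m[1])
}

// fetchPage is a hypothetical placeholder for requesting and parsing one page of reviews.
func fetchPage(page int) []string {
	return []string{fmt.Sprintf("review from page %d", page)}
}

func main() {
	lastPage, err := lastPageFromHref("/review/example.com?page=5")
	if err != nil {
		log.Fatal(err)
	}

	// Single collector goroutine, since the slice must not be written to concurrently.
	reviews := make([]string, 0)
	reviewsChan := make(chan string)
	quitChan := make(chan struct{})
	go func() {
		for r := range reviewsChan {
			reviews = append(reviews, r)
		}
		close(quitChan)
	}()

	// Page 1 is assumed to be processed already, so the fan-out starts from page 2.
	var wg sync.WaitGroup
	for i := 2; i <= lastPage; i++ {
		wg.Add(1)
		go func(page int) {
			defer wg.Done()
			for _, r := range fetchPage(page) {
				reviewsChan <- r
			}
		}(i)
	}
	wg.Wait()

	// Nothing writes to the channel anymore: close it, then wait for the collector to drain it.
	close(reviewsChan)
	<-quitChan
	log.Printf("collected %d reviews from pages 2..%d", len(reviews), lastPage)
}
```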