Web Crawling

    Crawling the web with selenium and sending the results as a Slack message

    1. Install selenium, schedule, and requests
    - selenium: launches and controls a web browser. Because it drives the browser directly, it also works on dynamic web pages.
    - requests: downloads files and web pages from the internet.
    - schedule: lets you run a given task on a schedule.
    - beautifulsoup: parses HTML, the format web pages are written in. It only parses the markup it is given and does not run JavaScript, so on its own it is only suitable for static web pages.
    (The page I want to crawl is a dynamic web page, so I'll use selenium 🤔)
    pip3 install selenium
    pip3 install requests
    pip3 install schedule
    2. The HTML of a specific URL ..
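
    The excerpt is cut off before any code appears, so here is only a rough sketch of how the three libraries could fit together. The URL, the CSS selector, the webhook URL, and every function name below are hypothetical, and sending to Slack through an incoming webhook is my assumption (the post does not say which Slack mechanism it uses).

```python
import time


def build_slack_payload(titles, source_url):
    """Format scraped titles as a Slack incoming-webhook payload (hypothetical helper)."""
    if not titles:
        text = f"No items found at {source_url}"
    else:
        lines = "\n".join(f"- {title}" for title in titles)
        text = f"Latest items from {source_url}:\n{lines}"
    # Incoming webhooks accept a JSON body with a "text" field.
    return {"text": text}


def fetch_titles(url, css_selector, timeout=10):
    """Open the page in a real browser and return the text of matching elements."""
    # Third-party imports live inside the functions so the pure helper above
    # can be used and tested on a machine without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumes Chrome is installed locally
    try:
        driver.get(url)
        # Dynamic pages render late, so wait until the elements exist.
        elements = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, css_selector))
        )
        return [el.text.strip() for el in elements if el.text.strip()]
    finally:
        driver.quit()  # always close the browser, even on errors


def send_to_slack(webhook_url, payload):
    """POST the payload to a Slack incoming webhook (assumed delivery method)."""
    import requests

    response = requests.post(webhook_url, json=payload, timeout=10)
    response.raise_for_status()


def run_forever(url, css_selector, webhook_url, every_minutes=30):
    """Crawl and notify on a fixed interval using the schedule library."""
    import schedule

    def job():
        titles = fetch_titles(url, css_selector)
        send_to_slack(webhook_url, build_slack_payload(titles, url))

    schedule.every(every_minutes).minutes.do(job)
    while True:
        schedule.run_pending()
        time.sleep(1)
```

    To try it, call something like `run_forever("https://example.com/news", "h2.title", "<your webhook URL>")` with your own values. Waiting with `WebDriverWait` instead of a fixed `time.sleep` is a design choice on my part, because a dynamic page may finish rendering at different times.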

    Web Crawling (BeautifulSoup)

    1. Install BeautifulSoup
    - So what is BeautifulSoup? Pretty soup...? It is a Python library for pulling data out of HTML and XML files. The name is said to come from 'Alice in Wonderland', and means something like "it arranges things beautifully"?!
    Beautiful Soup Documentation - Beautiful Soup 4.9.0 documentation
    Non-pretty printing: If you just want a string, with no fancy formatting, you can call str() on a BeautifulSoup object (unicode() in Python 2), or on a Tag within it: str(soup) # ' ..
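
    As a small illustration of the quoted documentation passage, the sketch below parses a made-up HTML string, pulls out its links, and compares `str(soup)` with `soup.prettify()`. The HTML snippet and variable names are my own, and it assumes the library was installed with `pip3 install beautifulsoup4`.

```python
from bs4 import BeautifulSoup

# A made-up static page; a real crawl would download this HTML first.
html = (
    '<html><body><h1>Hello</h1>'
    '<a href="/a">first</a><a href="/b">second</a>'
    '</body></html>'
)

# "html.parser" is Python's built-in parser, so no extra package is needed.
soup = BeautifulSoup(html, "html.parser")

# Collect (link text, href) pairs from every <a> tag.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

plain = str(soup)         # the markup as a plain string, no added formatting
pretty = soup.prettify()  # the same markup, one tag per line with indentation

print(links)
print(plain)
print(pretty)
```

    `str(soup)` is handy when the markup is going to be stored or compared, while `prettify()` is easier to read when inspecting a page by eye.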