Profile scraping error: res = description.find_element(By.TAG_NAME,"a").find_elements(By.XPATH,"*") #178
Comments
Can you provide the code that you've used please? |
Sorry, I forgot to include the failing account in the first place. This bug occurred on this profile: https://www.linkedin.com/in/sheanahamill/.
from selenium.common.exceptions import WebDriverException
from selenium import webdriver
from linkedin_scraper import Person, actions, Company
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager
import time, pickle
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("user-data-dir=C:\\Users\\User\\AppData\\Local\\Google\\Chrome\\User Data\\Profile 3")
driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()), options=options)
person=Person("https://www.linkedin.com/in/sheanahamill", driver=driver, scrape=False)
time.sleep(3)
person.scrape(close_on_complete=False)
name=person.name
title=person.job_title
now_company=person.company
print(name, title, now_company)
experience=person.experiences
print(experience)
current_company=experience[0]
print(current_company)
link_to_company=current_company.linkedin_url
print(link_to_company)
location=current_company.location
print(location)
company=Company(link_to_company, driver=driver, get_employees=False, close_on_complete=False)
company_name=company.name
company_size=company.company_size
company_website=company.website
about=company.about_us
print(company_name, company_size, company_website, about)
This code works fine with other accounts (other than the log1 problem from #173). |
Hey - I only updated two functions as I needed: get_experiences() and get_name_and_location(). In addition to UI updates I also fixed the scraper issue where it gets confused when a person has multiple positions at the same company over time. You can selectively scrape by doing this:
This is from ~ a week ago, hopefully still working. |
Still facing the same issue even with this update. |
I just deployed a fix. Please try v2.11.2. |
Thanks, it works. I tested it on 2 profiles but haven't tested at scale yet.
And now it works for me. |
Hi, it is not working for me. I changed the properties as shown; I was checking with "https://www.linkedin.com/company/google". |
The way I troubleshot it is:
|
This error occurred when scraping a person who held multiple positions at the same organization.
I checked the page structure and it looks like it should work, but for some reason it fails.
This part of the code causes the problem:
if position_summary_text and len(position_summary_text.find_element(By.CLASS_NAME,"pvs-list").find_elements(By.XPATH,"li")) > 1: #.find_element(By.CLASS_NAME,"pvs-list")
    descriptions = position_summary_text.find_element(By.CLASS_NAME,"pvs-list").find_element(By.CLASS_NAME,"pvs-list").find_elements(By.XPATH,"li")
    for description in descriptions:
        res = description.find_element(By.TAG_NAME,"a").find_elements(By.XPATH,"*")
        position_title_elem = res[0] if len(res) > 0 else None
        work_times_elem = res[1] if len(res) > 1 else None
        location_elem = res[2] if len(res) > 2 else None
It can't find an <a> tag inside the description element, so res is never assigned.
As far as I understand, it tries to find the top part of the job description (title, duration at position, location), and all of this is located under an <a> tag on the web page. @joeyism do you have any insights on that? Am I referring correctly to the part of the page that this code is trying to analyse?
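For anyone hitting the same exception, here is a minimal sketch of one possible workaround. It is not the maintainer's fix (the change shipped in v2.11.2 is not shown in this thread). The idea is to call find_elements (plural), which returns an empty list when nothing matches, instead of find_element, which raises NoSuchElementException. The helper name split_position_fields and the FakeElement stand-in are hypothetical and exist only so the sketch runs without a browser.

```python
# By.TAG_NAME and By.XPATH are the strings "tag name" and "xpath" in Selenium;
# fall back to the raw strings so the sketch also runs without Selenium installed.
try:
    from selenium.webdriver.common.by import By
    TAG_NAME, XPATH = By.TAG_NAME, By.XPATH
except ImportError:
    TAG_NAME, XPATH = "tag name", "xpath"


def split_position_fields(description):
    """Hypothetical helper: return (title, work_times, location) elements.

    Any field that is not present on the page comes back as None.
    """
    # find_elements returns [] when nothing matches,
    # whereas find_element raises NoSuchElementException.
    anchors = description.find_elements(TAG_NAME, "a")
    if not anchors:
        return None, None, None
    res = list(anchors[0].find_elements(XPATH, "*"))[:3]
    res += [None] * (3 - len(res))  # pad missing fields with None
    return tuple(res)


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, anchors=(), children=()):
        self._anchors = list(anchors)
        self._children = list(children)

    def find_elements(self, by, value):
        if by == TAG_NAME and value == "a":
            return self._anchors
        if by == XPATH and value == "*":
            return self._children
        return []


# An entry with no link at all, and one whose link has only two children.
entry_without_link = FakeElement()
entry_with_link = FakeElement(anchors=[FakeElement(children=["title", "dates"])])

print(split_position_fields(entry_without_link))  # (None, None, None)
print(split_position_fields(entry_with_link))     # ('title', 'dates', None)
```

Inside get_experiences() the loop body could then unpack the three values directly. Whether a missing link should be skipped silently or logged is a judgment call that depends on how LinkedIn renders that profile.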
The whole error message:
`Traceback (most recent call last):
File "C:\Users\User\PycharmProjects\pythonProject\pythonProject\lists_check.py", line 23, in <module>
person.scrape(close_on_complete=False)
File "C:\Users\User\PycharmProjects\pythonProject\pythonProject\venv\lib\site-packages\linkedin_scraper\person.py", line 89, in scrape
self.scrape_logged_in(close_on_complete=close_on_complete)
File "C:\Users\User\PycharmProjects\pythonProject\pythonProject\venv\lib\site-packages\linkedin_scraper\person.py", line 285, in scrape_logged_in
self.get_experiences()
File "C:\Users\User\PycharmProjects\pythonProject\pythonProject\venv\lib\site-packages\linkedin_scraper\person.py", line 156, in get_experiences
res = description.find_element(By.TAG_NAME,"a").find_elements(By.XPATH,"*")
File "C:\Users\User\PycharmProjects\pythonProject\pythonProject\venv\lib\site-packages\selenium\webdriver\remote\webelement.py", line 417, in find_element
return self._execute(Command.FIND_CHILD_ELEMENT, {"using": by, "value": value})["value"]
File "C:\Users\User\PycharmProjects\pythonProject\pythonProject\venv\lib\site-packages\selenium\webdriver\remote\webelement.py", line 395, in _execute
return self._parent.execute(command, params)
File "C:\Users\User\PycharmProjects\pythonProject\pythonProject\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 346, in execute
self.error_handler.check_response(response)
File "C:\Users\User\PycharmProjects\pythonProject\pythonProject\venv\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 245, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"tag name","selector":"a"}
(Session info: chrome=114.0.5735.134); For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors#no-such-element-exception
Stacktrace:
Backtrace:
GetHandleVerifier [0x0025A813+48355]
(No symbol) [0x001EC4B1]
(No symbol) [0x000F5358]
(No symbol) [0x001209A5]
(No symbol) [0x00120B3B]
(No symbol) [0x00119AE1]
(No symbol) [0x0013A784]
(No symbol) [0x00119A36]
(No symbol) [0x0013AA94]
(No symbol) [0x0014C922]
(No symbol) [0x0013A536]
(No symbol) [0x001182DC]
(No symbol) [0x001193DD]
GetHandleVerifier [0x004BAABD+2539405]
GetHandleVerifier [0x004FA78F+2800735]
GetHandleVerifier [0x004F456C+2775612]
GetHandleVerifier [0x002E51E0+616112]
(No symbol) [0x001F5F8C]
(No symbol) [0x001F2328]
(No symbol) [0x001F240B]
(No symbol) [0x001E4FF7]
BaseThreadInitThunk [0x762B0099+25]
RtlGetAppContainerNamedObjectPath [0x77A97B6E+286]
RtlGetAppContainerNamedObjectPath [0x77A97B3E+238]
(No symbol) [0x00000000]
Process finished with exit code 1
`