I want to return whether some company was acquired and by whom
You can scrape the Crunchbase website to get this information. The downside is that your search is limited to that one site. To widen coverage, you could add other sites as well.
import re

import requests
from bs4 import BeautifulSoup

while True:
    print()
    organization_name = input('Enter organization_name: ').strip().lower()
    crunchbase_url = 'https://www.crunchbase.com/organization/' + organization_name
    # Send a browser-like User-Agent header with the request.
    headers = {
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36'
    }
    r = requests.get(crunchbase_url, headers=headers)
    if r.status_code == 404:
        print('This organization is not available\n')
    else:
        soup = BeautifulSoup(r.text, 'html.parser')
        overview_h2 = soup.find('h2', string=re.compile('Overview'))
        try:
            # The first labelled value after the Overview heading reads
            # 'Acquired by' when the company has been acquired.
            possible_acquired_by_span = overview_h2.find_next('span', class_='bigValueItemLabelOrData')
            if possible_acquired_by_span.text.strip() == 'Acquired by':
                # The acquirer's name is in the next span of the same class.
                acquired_by = possible_acquired_by_span.find_next('span', class_='bigValueItemLabelOrData').text.strip()
            else:
                acquired_by = False
        except Exception as e:
            # Reached, for example, when the Overview heading is missing
            # and overview_h2 is None.
            acquired_by = False
            # uncomment below line if you want to see the error
            # print(e)
        if acquired_by:
            print('Acquired By: ' + acquired_by + '\n')
        else:
            print('No acquisition information available\n')
    again = input('Do You Want To Continue? ').strip().lower()
    if again not in ['y', 'yes']:
        break
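The script above builds the URL by lowercasing the input and appending it directly, so a name containing spaces or punctuation (for example "Adobe Systems") would not produce a clean path. Below is a minimal sketch of a helper that normalizes the name first. It assumes Crunchbase organization paths are lowercase and hyphen-separated, which I have not verified for every company; `to_slug` and `build_url` are hypothetical names, not part of any library.

```python
import re

BASE_URL = 'https://www.crunchbase.com/organization/'


def to_slug(name: str) -> str:
    """Turn a free-form company name into a lowercase, hyphen-separated slug."""
    # Lowercase, then collapse every run of non-alphanumeric characters
    # into a single hyphen.
    slug = re.sub(r'[^a-z0-9]+', '-', name.strip().lower())
    # Drop hyphens left at either end, e.g. from trailing punctuation.
    return slug.strip('-')


def build_url(name: str) -> str:
    """Build the organization page URL (assumed slug format, see above)."""
    return BASE_URL + to_slug(name)
```

You would then replace the string concatenation in the loop with `crunchbase_url = build_url(organization_name)`.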
Sample Output:
Enter organization_name: Marketo
Acquired By: Adobe Systems
Do You Want To Continue? y
Enter organization_name: Facebook
No acquisition information available
Do You Want To Continue? y
Enter organization_name: FakeCompany
This organization is not available
Do You Want To Continue? n
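As for extending the search beyond a single site, one possible shape is to wrap each site's lookup in a function and try them in order until one returns an answer. This is only a sketch: `find_acquirer` is a hypothetical helper, and the two stub functions stand in for real scrapers so that the example runs without network access.

```python
from typing import Callable, Iterable, Optional

# A "source" is any function that takes a company name and returns the
# acquirer's name, or None when it has no acquisition information.
Source = Callable[[str], Optional[str]]


def find_acquirer(name: str, sources: Iterable[Source]) -> Optional[str]:
    """Return the first acquirer reported by any source, or None."""
    for source in sources:
        try:
            acquirer = source(name)
        except Exception:
            # A failing site should not stop the search; try the next one.
            continue
        if acquirer:
            return acquirer
    return None


# Hypothetical stand-ins for real per-site scrapers.
def crunchbase_stub(name: str) -> Optional[str]:
    return {'marketo': 'Adobe Systems'}.get(name.lower())


def unreachable_stub(name: str) -> Optional[str]:
    raise ConnectionError('site unreachable')


result = find_acquirer('Marketo', [unreachable_stub, crunchbase_stub])
print(result or 'No acquisition information available')
```

In a real version, the Crunchbase lookup from the script above would become one of the source functions, and each additional site would get its own.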
Notes
Read the Crunchbase terms and seek their consent before you deploy this in any commercial project.
Also check out the Crunchbase API - I think it is the legitimate way to get what you are asking for.