I am trying to create a Java program that performs a login against an Achievo instance, using screen scraping. I manage to log in using the following code:

    @Test
    public void testLogin() throws Exception {
        HashMap<String, String> data = new HashMap<String, String>();
        data.put("auth_user", "user");
        data.put("auth_pw", "password");
        doSubmit("https://someurl.com/achievo/index.php", data);
    }

    private void doSubmit(String url, HashMap<String, String> data) throws Exception {
        URL siteUrl = new URL(url);
        HttpsURLConnection conn = (HttpsURLConnection) siteUrl.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setDoInput(true);
        //conn.setRequestProperty( "User-agent", "spider" );
        //conn.setRequestProperty("User-agent", "Opera/9.80 (X11; Linux i686; U; en) Presto/2.7.62 Version/11.01");
        conn.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 1.2.30703)");

        // Write the form fields as a URL-encoded POST body.
        DataOutputStream out = new DataOutputStream(conn.getOutputStream());
        Set<String> keys = data.keySet();
        Iterator<String> keyIter = keys.iterator();
        StringBuilder content = new StringBuilder("");
        for (int i = 0; keyIter.hasNext(); i++) {
            String key = keyIter.next();
            if (i != 0) {
                content.append("&");
            }
            content.append(key + "=" + URLEncoder.encode(data.get(key), "UTF-8"));
        }
        System.out.println(content.toString());
        out.writeBytes(content.toString());
        out.flush();
        out.close();

        // Print the response body line by line.
        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line = "";
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
    }

However, after Achievo logs me in successfully, I am redirected to the main page, which says: Achievo

Your browser doesnt support frames, but this is required to run Achievo

Obviously, I get the "Your browser doesnt support frames, but this is required to run Achievo" message. I have tried to access the dispatch.php frame directly, as this is probably what I want; however, it reports that my session has expired and that I need to log in again. Is there some way to fake a frame? Or to somehow keep the connection, change the URL, and try to get the dispatch.php frame?

Using HtmlUnit, I have done the following:

    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3);
    HtmlPage page = webClient.getPage("https://someurl.com/index.php");
    System.out.println(page.asXml());

    List<HtmlForm> forms = page.getForms();
    assertTrue(forms != null && !forms.isEmpty());
    HtmlForm form = forms.get(0);
    HtmlSubmitInput submit = form.getInputByName("login");
    HtmlInput inputUsername = form.getInputByName("auth_user");
    HtmlInput inputPw = form.getInputByName("auth_pw");
    inputUsername.setValueAttribute("foo");
    inputPw.setValueAttribute("bar");
    HtmlPage page2 = submit.click();

    CookieManager cookieManager = webClient.getCookieManager();
    Set<Cookie> cookies = cookieManager.getCookies();
    System.out.println("Is cookie " + cookieManager.isCookiesEnabled());
    for (Cookie cookie : cookies) {
        System.out.println(cookie.toString());
    }
    System.out.println(page2.asXml());
    webClient.closeAllWindows();

Here I get the form, submit it, and receive the same message. When I print the cookies, I can see that I have one. Now the question is: how do I proceed to get the dispatch.php frame using the logged-in cookie?

1 Answer

This kind of scraping is a bit complicated, and there are several factors to think about.

Does the Achievo app set any cookies? If so, you will need to accept them and send them with the next request. I suspect you're getting back a "session expired" message because you're not sending a cookie, or something along those lines.

By the looks of things, you will also need to parse that HTML page and extract the URL of the frame you wish to load. Make sure you use the exact URL provided in the FRAMESET.

I suggest using the Apache HttpClient module, which is a bit more fully featured than the standard Java URL provider and can manage things like cookies for you.
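The two steps above (keep the session cookie, then load the exact frame URL from the FRAMESET) can be sketched roughly as follows. This is only an illustration under assumptions, not a tested Achievo client: it uses the JDK's own `java.net.CookieManager` in place of Apache HttpClient, the class and method names (`FrameScrapeSketch`, `frameSources`, `loginAndFetchFrame`) are made up for the example, the sample frameset in `main` is invented, and a regular expression stands in for a real HTML parser. It needs Java 10 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FrameScrapeSketch {

    // Matches the src attribute of <frame> and <iframe> tags (but not <frameset>).
    private static final Pattern FRAME_SRC = Pattern.compile(
            "<i?frame\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Builds an application/x-www-form-urlencoded body from the given fields. */
    static String formEncode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    /** Returns the src values of all frames in the page, in document order. */
    static List<String> frameSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = FRAME_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(1));
        }
        return sources;
    }

    /** Resolves a (possibly relative) frame src against the URL of the page containing it. */
    static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    /** Sends a GET (postBody == null) or a form POST and returns the response body. */
    static String fetch(String url, String postBody) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        if (postBody != null) {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(postBody.getBytes(StandardCharsets.UTF_8));
            }
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /**
     * Logs in, then requests one frame of the returned frameset in the same session.
     * Assumes the login response is the frameset page; frameIndex is chosen by the caller.
     */
    static String loginAndFetchFrame(String loginUrl, Map<String, String> credentials,
                                     int frameIndex) throws IOException {
        // A JVM-wide cookie store: HttpURLConnection saves and replays cookies through it.
        CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));
        String framesetHtml = fetch(loginUrl, formEncode(credentials));
        String frameUrl = resolve(loginUrl, frameSources(framesetHtml).get(frameIndex));
        return fetch(frameUrl, null);
    }

    public static void main(String[] args) {
        // Offline demonstration with an invented frameset; no network access here.
        String sample = "<frameset rows=\"50,*\"><frame name=\"top\" src=\"top.php\">"
                + "<frame name=\"main\" src=\"dispatch.php?x=1\"></frameset>";
        for (String src : frameSources(sample)) {
            System.out.println(resolve("https://someurl.com/achievo/index.php", src));
        }
    }
}
```

Against a real instance, the call would look something like `loginAndFetchFrame("https://someurl.com/achievo/index.php", Map.of("auth_user", "user", "auth_pw", "password"), 1)`, where the frame index has to be picked after inspecting the actual frameset. If a src attribute contains HTML entities such as `&amp;`, it would need unescaping before being resolved.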
