I have a view like this in Hive:

    id         sequencenumber  appname
    242539622  1               A
    242539622  2               A
    242539622  3               A
    242539622  4               B
    242539622  5               B
    242539622  6               C
    242539622  7               D
    242539622  8               D
    242539622  9               D
    242539622  10              B
    242539622  11              B
    242539622  12              D
    242539622  13              D
    242539622  14              F

I'd like to have, for each id, the following view:

    id         sequencenumber  appname  appname_c
    242539622  1               A        A
    242539622  2               A        A
    242539622  3               A        A
    242539622  4               B        B_1
    242539622  5               B        B_1
    242539622  6               C        C
    242539622  7               D        D_1
    242539622  8               D        D_1
    242539622  9               D        D_1
    242539622  10              B        B_2
    242539622  11              B        B_2
    242539622  12              D        D_2
    242539622  13              D        D_2
    242539622  14              F        F

Or anything close to this that can identify the re-occurrence of a given event in the sequence.

My ultimate goal is to calculate the time spent in each group of events (or state, in the context of Markov modeling), taking any loop-back into account. For example, the time spent in B_1 in the above example can be very different from the time spent in B_2.

I have looked into window functions in Hive (link), but I don't think they can do row-wise comparisons the way R/Python can.

1 Answer

Here is a solution using Hive window functions. I tested it with your data; remove the your_table CTE and use your own table instead. The result is as expected.

    with your_table as (--remove this CTE, use your table instead
    select stack(14,
    '242539622', 1,'A',
    '242539622', 2,'A',
    '242539622', 3,'A',
    '242539622', 4,'B',
    '242539622', 5,'B',
    '242539622', 6,'C',
    '242539622', 7,'D',
    '242539622', 8,'D',
    '242539622', 9,'D',
    '242539622',10,'B',
    '242539622',11,'B',
    '242539622',12,'D',
    '242539622',13,'D',
    '242539622',14,'F'
    ) as (id,sequencenumber,appname)
    ) --remove this CTE, use your table instead

    select id,sequencenumber,appname,
           case when sum(new_grp_flag) over(partition by id, group_name) = 1
                then appname --only one group of consecutive runs exists (like A)
                else
                    nvl(concat(group_name, '_',
                               sum(new_grp_flag) over(partition by id, group_name order by sequencenumber) --rolling sum of new_grp_flag
                              ),appname)
            end appname_c
    from
    (
    select id,sequencenumber,appname,
           case when appname=prev_appname or appname=next_appname then appname end group_name, --identify group of the same app
           case when appname<>prev_appname or prev_appname is null then 1 end new_grp_flag     --one 1 per each group
    from
    (
    select id,sequencenumber,appname,
           lag(appname)  over(partition by id order by sequencenumber) prev_appname, --need these columns
           lead(appname) over(partition by id order by sequencenumber) next_appname  --to identify groups of records with the same app
    from your_table --replace with your table
    )s
    )s
    order by id,sequencenumber
    ;

Result:

    OK
    id         sequencenumber  appname  appname_c
    242539622  1               A        A
    242539622  2               A        A
    242539622  3               A        A
    242539622  4               B        B_1
    242539622  5               B        B_1
    242539622  6               C        C
    242539622  7               D        D_1
    242539622  8               D        D_1
    242539622  9               D        D_1
    242539622  10              B        B_2
    242539622  11              B        B_2
    242539622  12              D        D_2
    242539622  13              D        D_2
    242539622  14              F        F
    Time taken: 232.319 seconds, Fetched: 14 row(s)
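Since the question mentions R/Python-style row-wise comparisons, the same run-labelling idea can be sketched in plain Python as a way to sanity-check the expected appname_c values on a small sample. This is only an illustration, not the Hive query itself: the function name label_runs and the tuple layout are assumptions made for this example. It also differs from the query in one respect: as far as I can tell, the query never adds a suffix to a run that consists of a single row, whereas this sketch suffixes every app that appears in more than one run.

```python
from collections import Counter, defaultdict
from itertools import groupby


def label_runs(rows):
    """Label runs of consecutive identical appname values per id.

    rows: iterable of (id, sequencenumber, appname) tuples (hypothetical layout).
    Returns a list of (id, sequencenumber, appname, appname_c) tuples.
    """
    by_id = defaultdict(list)
    for row in rows:
        by_id[row[0]].append(row)

    labelled = []
    for items in by_id.values():
        # Order each id's rows by sequencenumber, like "order by sequencenumber".
        items.sort(key=lambda r: r[1])
        # groupby splits the ordered rows into runs of consecutive equal appname.
        runs = [(app, list(group)) for app, group in groupby(items, key=lambda r: r[2])]
        runs_per_app = Counter(app for app, _ in runs)
        seen = Counter()
        for app, members in runs:
            seen[app] += 1
            # Add a suffix only when the app occurs in more than one run.
            label = app if runs_per_app[app] == 1 else f"{app}_{seen[app]}"
            labelled.extend((i, s, a, label) for i, s, a in members)
    return labelled


# Sample data from the question: sequence numbers 1..14 for a single id.
apps = "AAABBCDDDBBDDF"
rows = [("242539622", n, app) for n, app in enumerate(apps, start=1)]
labels = [r[3] for r in label_runs(rows)]
print(labels)
```

On the sample data this gives the same appname_c column as the table in the question, so it can serve as a quick reference when checking the Hive output on a small extract.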
