[cs615asa] Wrong Answers

Jason Ajmo jajmo at stevens.edu
Fri Apr 6 15:57:16 EDT 2018


Matt,

That makes sense. However, a case-insensitive, unique sort on column 2
still yields an incorrect answer:
# gzcat data.gz | grep "^en[\. ]" | sort -ufk 2 | wc -l
 2232660
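
One possible factor, offered as a sketch rather than a diagnosis: with
`sort`, `-k 2` defines a key that starts at field 2 and runs to the end of
the line, whereas `-k 2,2` restricts the key to field 2 alone. The rows
below are made-up sample data, assuming a layout of project, page title,
and two numeric columns; they are not taken from data.gz, and the variable
names are only illustrative.

```shell
# Hypothetical sample rows (assumed layout: project, title, two numbers).
sample='en Main_Page 10 100
en Main_page 3 30
en Other 1 5'

# Key runs from field 2 to the end of the line, so the trailing numbers
# keep the two Main_Page spellings distinct even with -f.
whole_tail=$(printf '%s\n' "$sample" | LC_ALL=C sort -u -f -k 2 | wc -l | tr -d ' ')

# Key is field 2 only, so the two spellings compare equal under -f
# and -u keeps just one of them.
field_only=$(printf '%s\n' "$sample" | LC_ALL=C sort -u -f -k 2,2 | wc -l | tr -d ' ')

echo "key 2..EOL: $whole_tail, key 2,2: $field_only"
```

On this sample the first pipeline keeps all three rows and the second
keeps two. Whether the same difference accounts for the count above
depends on the real data, which I have not checked.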


On Fri, Apr 6, 2018 at 3:49 PM Matthew Gomez <mgomez1 at stevens.edu> wrote:

> I read the question as “the second column must be unique” and there are
> things like Main_page and Main_Page which I assume are the same object. So
> I was uniqing on the second column.
>
> Matt
>
> On Fri, Apr 6, 2018 at 3:43 PM Jason Ajmo <jajmo at stevens.edu> wrote:
>
>> I've gotten 3-5, but I'm struggling a little on 1 and 2. I've tried many
>> different variations, but can't seem to get the right answer. Here are my
>> two that I feel are "most" correct.
>>
>> 1:
>> # gzcat data.gz | grep "^en[\. ]" | wc -l
>>  2233318
>> I didn't think sorting or `uniq`ing was necessary, since each row
>> should already be unique.
>>
>> 2:
>> # gzcat data.gz | grep "^en[\. ]" | awk '{ print $2 " " $(NF - 1) }' |
>> sort -nrk 2 | head -n 1
>> en 3127515
>> For this one, I had to do a little data transformation with awk since
>> using sort with -k 3 and no awk was giving clearly incorrect results.
>>
>> The initial gzcat and grep are correct, since they are the foundation I
>> used for 3-5. Any feedback on my statements above would be appreciated.
>>
>> Thanks.
>> --
>> Jason Ajmo
>> Stevens Institute of Technology
>> B.S. Cybersecurity '17
>> M.S. Computer Science '18
>> 0x56FA3123
>>
> _______________________________________________
>> cs615asa mailing list
>> cs615asa at lists.stevens.edu
>> https://lists.stevens.edu/mailman/listinfo/cs615asa
>>
-- 
Jason Ajmo
Stevens Institute of Technology
B.S. Cybersecurity '17
M.S. Computer Science '18
0x56FA3123