[cs615asa] Wrong Answers

Matthew Gomez mgomez1 at stevens.edu
Fri Apr 6 15:49:07 EDT 2018


I read the question as “the second column must be unique.” There are
entries like Main_page and Main_Page, which I assume are the same object,
so I was uniq'ing on the second column.
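
For illustration, here is a minimal sketch of that reading. It assumes the
rows are whitespace-separated with the page title in the second field (as
the awk call in the quoted message suggests) and that titles differing only
in case count as one object. The function name count_unique_titles and the
sample rows are made up for the example, and the grep pattern is written as
'^en[. ]' rather than copied from the quoted commands.

```sh
# Count distinct second-column values, ignoring case.
# Assumption: fields are "project title count bytes", whitespace-separated.
count_unique_titles() {
    grep -E '^en[. ]' |                 # keep rows whose project is "en" or "en.<something>"
        awk '{ print tolower($2) }' |   # fold the title to lower case
        sort -u |                       # one line per distinct title
        wc -l |
        tr -d ' '                       # some wc implementations pad with spaces
}

# Hypothetical sample rows, not taken from data.gz.
sample='en Main_page 10 100
en Main_Page 5 50
en.b Other 1 10
de Main_Page 7 70
eng Foo 1 1'

result=$(printf '%s\n' "$sample" | count_unique_titles)
echo "$result"   # prints 2: main_page and other
```

Against the real file this would be fed as gzcat data.gz | count_unique_titles.
Whether folding case is actually what the question wants is my assumption.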

Matt

On Fri, Apr 6, 2018 at 3:43 PM Jason Ajmo <jajmo at stevens.edu> wrote:

> I've gotten 3-5, but I'm struggling a little on 1 and 2. I've tried many
> different variations, but can't seem to get the right answer. Here are my
> two that I feel are "most" correct.
>
> 1:
> # gzcat data.gz | grep "^en[\. ]" | wc -l
>  2233318
> I didn't feel like sorting or `uniq`ing were necessary since each row
> should be unique as it is.
>
> 2:
> # gzcat data.gz | grep "^en[\. ]" | awk '{ print $2 " " $(NF - 1) }' |
> sort -nrk 2 | head -n 1
> en 3127515
> For this one, I had to do a little data transformation with awk since
> using sort with -k 3 and no awk was giving clearly incorrect results.
>
> The initial gzcat and grep are correct, since it's the foundation I used
> for 3-5. Any feedback on my statements above would be appreciated.
>
> Thanks.
> --
> Jason Ajmo
> Stevens Institute of Technology
> B.S. Cybersecurity '17
> M.S. Computer Science '18
> 0x56FA3123