[cs615asa] [CS615] HW5 Handling Garbage Input Properly

Matthew Gomez mgomez1 at stevens.edu
Fri Apr 6 13:35:39 EDT 2018


I’m not sure whether we’re supposed to match exactly “en” or anything
matching “en.*”. I worked on it for about 6.5 hours yesterday and couldn’t
get the first answer.

Matt

On Fri, Apr 6, 2018 at 12:15 PM Patrick Murray <pmurray1 at stevens.edu> wrote:

> After having an incredibly difficult time attempting to compute answers
> for the first part of the assignment, I've decided to use Python to compute
> the correct answers before implementing the solution with Unix tools.
>
> How should we handle the tokenization of malformed input such as the
> following line (3132224)?
>
> en dÃÃâ€Â
> ’ÂÂÃâ� 1 4867
>
> Note that the page title contains whitespace, which is also the field
> delimiter.
>
> Best,
> Pat
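
One possible way to tokenize a line like the one quoted above, sketched in
Python since that is what Pat is using. It assumes each record has four
whitespace-separated fields in the order project, title, count, size, with
the two numeric fields last (as "1 4867" in the example suggests). The
function name parse_line and the field names are made up for illustration.
Splitting from the right means any stray whitespace ends up in the title
instead of shifting the numeric fields.

```python
def parse_line(line):
    """Return (project, title, count, size), or None if the line is malformed.

    Assumption: four whitespace-separated fields, the last two numeric.
    Any extra whitespace is treated as part of the title.
    """
    # Peel the two numeric fields off the right-hand side first.
    parts = line.rstrip("\n").rsplit(None, 2)
    if len(parts) != 3:
        return None
    head, count, size = parts

    # The first token of what remains is the project; the rest is the title.
    fields = head.split(None, 1)
    if len(fields) != 2:
        return None
    project, title = fields

    try:
        return project, title, int(count), int(size)
    except ValueError:
        # The trailing fields were not integers, so reject the line.
        return None
```

With the project isolated this way, the "en" versus "en.*" question becomes
a choice between project == "en" and project.startswith("en"). Which of the
two the assignment expects is not settled in this thread.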