7 kyu

NLP-Series #1 - Inverted Index

194 of 342 ProcrasTech
Fundamentals
Parsing
Regular Expressions
  • saudiGuy Avatar

    Python: Random tests are vulnerable to input modification
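The comment gives no details, so the following is only a guess at the kind of problem meant: if the tests call the user's solution before computing the expected value from the same array, a solution that mutates its input can corrupt the check. A minimal sketch in JavaScript, where `reference`, `mutatingSolution`, and the sample documents are all hypothetical:

```javascript
// Hypothetical reference solution: 1-based indices of documents containing the term.
const reference = (docs, term) =>
  docs.flatMap((d, i) => (d.includes(term) ? [i + 1] : []));

// Hypothetical user solution that mutates its argument before answering.
const mutatingSolution = (docs, term) => {
  docs.length = 0; // wipes the caller's array
  return [];
};

const docs = ["piano", "car", "piano car"];

// Fragile order: the user runs first, so the expected value is
// computed from an array that has already been emptied.
const fragileInput = [...docs];
const actualFragile = mutatingSolution(fragileInput, "piano");
const expectedFragile = reference(fragileInput, "piano");

// Safer order: compute the expected value first, then hand the user a copy.
const expected = reference(docs, "piano");
const actual = mutatingSolution([...docs], "piano");

console.log(expectedFragile); // []
console.log(expected);        // [ 1, 3 ]
```

In the fragile order the wrong answer `[]` would compare equal to the corrupted expected value; in the safer order it would not.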

  • laurelis24 Avatar

    Like others, I don't understand what I need to do. xD

  • ejini战神 Avatar

    Another one of those guesswork katas..........

    • Description does not mention which characters can make up a term

    • Does not mention how to filter based on the caseSensitive flag (the name is self-explanatory, but that is no excuse for leaving out an explanation)

    • Does not mention how to filter based on exactMatch (what should the preceding and following characters be? Must the term sit at the start of a word, at the end, or both to count as an exact match?)

    • Description should be language-agnostic

    • The title has nothing to do with the task and in fact causes more misunderstanding and fuss ~~

    • This is nowhere near 7 kyu, considering that it uses advanced regex operations, whereas a lot of 6 kyu katas just do string replacement or transformation

    Hence, discussion to send back to beta / retirement has been initiated!
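To make the ambiguity concrete, here is one possible reading of the two flags, sketched in JavaScript. The semantics are assumptions, not something the description confirms: `caseSensitive = false` is taken to mean a case-insensitive regex, and `exactMatch = true` is taken to mean a word boundary on both sides of the term. The helper name `matchesTerm` is hypothetical.

```javascript
// Assumed semantics (NOT confirmed by the kata description):
//  - caseSensitive = false adds the regex "i" flag
//  - exactMatch = true requires a word boundary (\b) on both sides of the term
function matchesTerm(text, term, caseSensitive, exactMatch) {
  // Escape regex metacharacters so the term is matched literally.
  const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = exactMatch ? `\\b${escaped}\\b` : escaped;
  return new RegExp(pattern, caseSensitive ? "" : "i").test(text);
}

console.log(matchesTerm("Pianos are heavy", "piano", false, false)); // true
console.log(matchesTerm("Pianos are heavy", "piano", false, true));  // false
console.log(matchesTerm("Pianos are heavy", "piano", true, false));  // false
```

Other readings are just as plausible (for example, treating only whitespace as a separator), which is exactly why the description needs to spell this out.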

  • inyoot Avatar

    I love this kata. There are several ways to improve it, though. I think the instructions need to be more detailed. How can I edit the instructions?

    • ProcrasTech Avatar

      Hey, sorry for being absent. I was busy moving to another city. I do not know how, or whether, I can give you edit rights. What would be your suggested instructions? Greetings and stay safe

    • inyoot Avatar

      Hi,

      I have already forgotten how I wanted to improve it. I was thinking about Term Frequency – Inverse Document Frequency (TF-IDF) when I saw this kata. Let me redo this kata and get back to you. I don't know about edit rights. I don't even know how to create my own kata yet... :D

    • ejini战神 Avatar

      Reraised above ~~

      Suggestion marked resolved by ejini战神 3 years ago
  • Puck Avatar

    Python translation 🐍

    Please review and approve.

  • JohanWiltink Avatar

    Passing an Array of Strings but expecting a comma-separated string to be returned seems inconsistent. And, plain and simple, an inappropriate use of datatypes.

    Can I make a case for expecting an Array of Numbers to be returned while we're still in Beta?

    • FArekkusu Avatar

      "This is my first Kata."

      Maybe the author doesn't know about Test.assertSimilar and Test.assertDeepEquals so he doesn't know how to properly compare arrays? (Do you really think this is an issue and not a suggestion?)

      I agree with Johan. Returning an array is simpler, more logical and is not as ugly as a comma-separated string without spaces v_v

    • JohanWiltink Avatar

      It was up for immediate approval, that's why I made it an issue. It should be a suggestion, but then someone would grab a point and we would no longer be in Beta.

    • ProcrasTech Avatar

      "And, plain and simple, an inappropriate use of datatypes."

      I can't argue against that. There was a reason to return a string, but it vanished once I kind of figured out how the test framework works (I had never used one before). Also, I thought publishing would put the kata in the draft stage (third day on Codewars, learning the ropes).

      Edit: Changed return datatype to array. Waiting for outside confirmation ;)

    • JohanWiltink Avatar

      Ah no, saving creates a Draft, publishing creates a Beta. Which is entirely the correct thing to do. :] Getting ( enough ) upvotes and solutions then, at some point, creates an approvable kata, and somebody with enough points can then approve it. Unresolved issues prevent being approvable ( well, until you have a lot of solutions and upvotes ). That's more or less how that works, in a nutshell. Resolving an issue is just a check mark under a post, you don't need to actually solve the problem. ( Most of the time, you should, of course. :P ) Also, raising an issue should not be considered a personal attack on author / translator / kata.

      I understand you've discovered Test.assertDeepEquals. Good. :] It's practically the only test you really need ( it can function as assertEquals as well ), besides, much less often, assertApproxEquals and expectError.

    • JohanWiltink Avatar

      Looking good. Closing. :]

      ETA: and approving.

      Issue marked resolved by JohanWiltink 7 years ago
    • ProcrasTech Avatar

      Haha, now I'm giddy. Honestly, I didn't think anybody would care about my kata. This community has been amazing so far!

  • FArekkusu Avatar

    The description is lacking. It should explain what we are asked to do, not just say "inverted index is very cool, go find out yourself what it is".

    • ProcrasTech Avatar

      I've updated the description. Is that better?

    • FArekkusu Avatar

      Uhh, I can't really see any changes. The problem is that you don't specify what an "inverted index" is. As I understand it, this is basically a 1-based index of the strings that match some pattern, but you don't explain it properly anywhere.

    • ProcrasTech Avatar

      Instead of an index that links each document to the terms it contains (which would be massive), create an index that links each term to the documents in which it occurs.
      index: document -> term1,term2,term3
      inv index: term -> document1,document2,document3
      Sorry, I struggle to word this
      Have a look at this
      I'll try to come up with a better description when I get home
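The term -> documents mapping described above can be sketched as follows. This is only an illustration, not the kata's reference solution: the function name `buildInvertedIndex`, the `\w+` tokenisation, the lower-casing, and the 1-based document numbers are all assumptions.

```javascript
// Hypothetical helper: maps each term to the 1-based numbers of the
// documents that contain it. Tokenising with /\w+/ is an assumption.
function buildInvertedIndex(documents) {
  const index = new Map();
  documents.forEach((doc, i) => {
    // A Set removes repeated terms within one document.
    const terms = new Set(doc.toLowerCase().match(/\w+/g) || []);
    for (const term of terms) {
      if (!index.has(term)) index.set(term, []);
      index.get(term).push(i + 1); // 1-based document number
    }
  });
  return index;
}

const docs = ["the cat sat", "the dog", "a cat and a dog"];
const index = buildInvertedIndex(docs);

console.log(index.get("cat")); // [ 1, 3 ]
console.log(index.get("dog")); // [ 2, 3 ]
```

Because documents are visited in order and each term is added at most once per document, every list comes out sorted and free of duplicates.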

    • FArekkusu Avatar

      I understood what it meant and just solved it. I also changed the description to make it clearer.

      Suggestion marked resolved by FArekkusu 7 years ago
    • ProcrasTech Avatar

      Perfect. Thanks!

  • ZED.CWT Avatar
    Test.expect(contains, 'not all document indices found')
    

    When expect is used and a case fails, nobody can see the expected value. assertEquals should be used instead.

    Actually, there is no need to test these separately:

    it ("inverted index contains document indices", function(){
    it ("inverted index is ordered", function(){
    it ("inverted index does not contain duplicates", function(){
    

    Because the description already requires the result to be sorted in ascending order and separated by commas, a single assertEquals should be used.

  • ZED.CWT Avatar
    Test.randomNumber%3===0
    

    randomNumber is a method, not a getter, so it has to be called with parentheses.
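Why the missing parentheses matter can be shown with a small stand-in object. `FakeTest` below is hypothetical and is not the real Codewars implementation; it only mimics a `randomNumber` method.

```javascript
// Hypothetical stand-in for the test framework's helper, used only to
// show the pitfall; it is not the real Codewars implementation.
const FakeTest = {
  randomNumber() {
    return Math.floor(Math.random() * 101);
  },
};

// Missing parentheses: the function itself is coerced to NaN,
// so the comparison is always false and that branch never runs.
const broken = FakeTest.randomNumber % 3 === 0;

// Calling the method yields a number, so the result can be true or false.
const fixed = FakeTest.randomNumber() % 3 === 0;

console.log(broken);       // false
console.log(typeof fixed); // boolean
```

With the original expression, whatever the random tests were meant to do in the "divisible by 3" case would never be exercised.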