omz:forum



    re (Regular Expression) module "caret" character not working?

    Editorial
    • TutorialDoctor
      TutorialDoctor last edited by

      In the code below, the action should return all words that begin with "the" as a list, but it returns a blank list.

      <pre>
      # Extracts words from input text and outputs it as a list
      import re
      import editor
      import workflow

      params = workflow.get_parameters()
      sentence = workflow.get_input()

      list = []

      expression = '^the'
      pattern = re.compile(expression)
      matches = re.findall(pattern,sentence)

      for word in matches:
          list.append(word)

      workflow.set_output('\n'.join(list))

      </pre>

      This code works with other special characters, but not with the caret. Any tips?

      • JonB
        JonB last edited by

        The caret matches only at the start of the string, i.e. only if "the" is the first word in the sentence.
        The expression you are looking for probably looks like:

        expression='\bthe'
        

        \b matches at a word boundary without consuming any characters.

        http://regex101.com/r/gC6nN8/1
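As a rough illustration of the difference between the caret and \b, here is a minimal sketch. The sample sentence and variable names are made up for this example, and the patterns are written as raw strings so that the backslash reaches the re module unchanged.

```python
import re

# Hypothetical sample input, not taken from the original workflow.
sentence = "Give them the theme"

# '^' anchors at the start of the string, and this sentence starts with "Give".
caret_matches = re.findall(r'^the', sentence)

# The caret only succeeds when "the" really is at the very beginning.
caret_at_start = re.findall(r'^the', "the cat saw them")

# '\b' matches at any word boundary, so every word starting with "the" is hit.
boundary_matches = re.findall(r'\bthe', sentence)

# Adding '\w*' extends each match to the rest of the word.
whole_words = re.findall(r'\bthe\w*', sentence)

print(caret_matches)     # []
print(caret_at_start)    # ['the']
print(boundary_matches)  # ['the', 'the', 'the']
print(whole_words)       # ['them', 'the', 'theme']
```

If whole words are wanted in the output rather than just the "the" prefix, the `\w*` variant is probably the one to use.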

        • ccc
          ccc last edited by

          Remember that you can also simplify things by using list comprehensions:

          # Instead of this:
          list = []
          
          for word in matches:
              list.append(word)
          
          workflow.set_output('\n'.join(list))
          
          # You can just write this:
          workflow.set_output('\n'.join([word for word in matches]))
          
          • omz
            omz last edited by

            @ccc The list comprehension seems redundant here, '\n'.join(matches) would do the same, as far as I can see.

            Btw, it's not a good idea to use list as a variable name: you'll run into problems when you try to use the built-in function list().
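To make the shadowing problem concrete, here is a small sketch. The function names are hypothetical and exist only for this example; the point is that once `list` is rebound to a list object, calling `list(...)` in that scope fails.

```python
def shadowed():
    # The local name 'list' now hides the built-in list() in this function.
    list = []
    try:
        return list('abc')      # tries to call a list object
    except TypeError as exc:
        return str(exc)


def not_shadowed():
    # A different variable name leaves the built-in list() usable.
    match_list = []
    match_list.extend(list('abc'))
    return match_list


print(shadowed())       # 'list' object is not callable
print(not_shadowed())   # ['a', 'b', 'c']
```

The same reasoning applies to `'\n'.join(matches)`: since `re.findall` already returns a list of strings, no intermediate list variable is needed at all.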

            • TutorialDoctor
              TutorialDoctor last edited by

              I am still not getting a match on the word "them", nor on an email address that begins with "the".

              <pre>

              # Extracts words from input text and outputs it as a list

              import re
              import editor
              import workflow

              params = workflow.get_parameters()
              sentence = workflow.get_input()

              match_list = []

              expression = '\bthe'
              pattern = re.compile(expression)
              matches = re.findall(pattern,sentence)

              for word in matches:
                  match_list.append(word)

              workflow.set_output('\n'.join(match_list))

              </pre>

              • omz
                omz last edited by

                You need to escape the backslash in the pattern or use a raw string, i.e. use either '\\bthe' or r'\bthe'.
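A short sketch of why the unescaped version finds nothing: in a normal Python string literal, `'\b'` is the backspace character, so the regex engine never sees a word-boundary token. The sample sentence and variable names below are assumptions made for illustration.

```python
import re

# Hypothetical sample input.
sentence = "Give them the theme; other words stay."

# In a normal string literal, '\b' is a single backspace character (0x08).
is_backspace = ('\b' == '\x08')

# So this pattern searches for a literal backspace followed by "the".
backspace_matches = re.findall('\bthe', sentence)

# A raw string or a doubled backslash passes the two characters '\' and 'b'
# to the re module, which then treats them as a word boundary.
raw_matches = re.findall(r'\bthe', sentence)
escaped_matches = re.findall('\\bthe', sentence)

print(is_backspace)        # True
print(backspace_matches)   # []
print(raw_matches)         # ['the', 'the', 'the']
print(escaped_matches)     # ['the', 'the', 'the']
```

Note that "other" does not produce a match, because its "the" is not preceded by a word boundary.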

                • TutorialDoctor
                  TutorialDoctor last edited by

                  Thanks, Ole. That did help.
