omz:forum

    This is the community forum for my apps Pythonista and Editorial.


    Parsing YAML with Python

    Editorial
    • jrh147

      I have a simple Markdown file with YAML header that looks something like this:

      ---
      title: "Mary had a little lamb"
      link: http://google.com
      ---
      

      I want to extract some of the YAML data to use later in a workflow. So far I have this, but it just gives me errors. Ideally, once I extract the title key, I would also like to strip the quotation marks, but I'm not sure how to do that yet either. Maybe there is an easier way?

      #coding: utf-8
      import console
      import yaml
      import editor
      
      from StringIO import StringIO
      text = StringIO(editor.get_text())
      
      doc = list(yaml.load_all(text))
      
      tweet_link = doc["link"]
      tweet_title = doc["title"]
      
      
      console.hud_alert(tweet_link)
      
      • ccc

        Try this:

        my_dict = yaml.load(editor.get_text())
        
        • jrh147

          @ccc said:

          Try this:

          my_dict = yaml.load(editor.get_text())

          I got the error "expected a single document, but found another document", pointing to the "---" lines as the culprit.
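
A note on that error: in YAML, "---" marks the start of a document, so a file with front matter is a stream of several documents, and a single-document load rejects it. One possible way around it, sketched below, is to read only the first document from the stream. The sample text and the helper name `read_front_matter` are made up for illustration, and `safe_load_all` is used here instead of the thread's `yaml.load` as a cautious default. The parser also removes the quotation marks around the title, so no extra stripping should be needed.

```python
import yaml  # PyYAML, a third-party package

# Hypothetical sample that mirrors the file from the first post, plus one body line.
SAMPLE = '''---
title: "Mary had a little lamb"
link: http://google.com
---
Some body text.
'''


def read_front_matter(text):
    """Return the first YAML document in text as a dict, or {} if there is none."""
    # safe_load_all returns a generator with one item per document;
    # next() asks for the first item only, with None as the default for empty input.
    first_document = next(yaml.safe_load_all(text), None)
    return first_document if isinstance(first_document, dict) else {}


front_matter = read_front_matter(SAMPLE)
print(front_matter["title"])  # the surrounding quotes are already gone
print(front_matter["link"])
```

This also hints at why the code in the first post fails: `list(yaml.load_all(...))` is a list of documents, so it has to be indexed with a number before a key such as "title" can be looked up.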

          • ccc

            my_dict = yaml.load(editor.get_text().replace('---', ''))

            • jrh147

              That worked great. For anyone following along, this is what I ended up with. It also checks whether the key 'link' exists before appending the link to the text that goes to the clipboard. Thanks!

              #coding: utf-8
              import yaml
              import editor
              import clipboard
              
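              # Remove the '---' markers (replacing a single '-' would also strip hyphens from titles and links)
              m = yaml.load(editor.get_text().replace('---', ''))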
              
              tweet = m['title']
              
              if "link" in m:
              	tweet = tweet + ' ' + m['link']
              
              clipboard.set(tweet)
              
              • ccc

                tweet += ' ' + m['link']
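
The line above is the augmented-assignment form of `tweet = tweet + ' ' + m['link']`; both build the same string. As a small sketch of the same idea, the optional key can also be read with `dict.get`, which avoids a separate membership test. The helper name `build_tweet` is hypothetical and a plain dict stands in for the parsed front matter.

```python
def build_tweet(front_matter):
    """Join the title and, when present, the link with a single space."""
    parts = [front_matter["title"]]
    # dict.get returns None when the key is missing; an empty link is skipped too.
    link = front_matter.get("link")
    if link:
        parts.append(link)
    return " ".join(parts)


print(build_tweet({"title": "Mary had a little lamb", "link": "http://google.com"}))
print(build_tweet({"title": "Mary had a little lamb"}))
```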

                • jrh147

                  Oh nice. Thanks! First weekend using Python (if you couldn't tell)

                  And I spoke too soon. It worked fine when my text had only YAML front matter. If there is anything after the second '---', I get the "Error Scanner while scanning a block scalar ... " error.

                  I think this is because it expects the entire text to be YAML. Is there any way I can stop scanning at the second ---?

                  • ccc

                    str.partition() is your friend...

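                    text = editor.get_text()
                    # rpartition() returns '' before the separator when no '---' exists, so fall back to the whole text
                    yaml_text = text.rpartition('---')[0] or text
                    yaml_dict = yaml.load(yaml_text.partition('---')[2] or yaml_text)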
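
For readers new to these two string methods, here is a short illustration of what they return. Both give a 3-tuple of (text before, separator, text after); `partition` splits at the first match and `rpartition` at the last. The sample string is made up for the demonstration.

```python
text = "---\ntitle: x\n---\nbody"

# partition splits at the FIRST '---': (before, separator, after)
print(text.partition("---"))   # ('', '---', '\ntitle: x\n---\nbody')

# rpartition splits at the LAST '---'
print(text.rpartition("---"))  # ('---\ntitle: x\n', '---', '\nbody')

# When the separator is missing, the whole string ends up at opposite ends:
print("no dashes".partition("---"))   # ('no dashes', '', '')
print("no dashes".rpartition("---"))  # ('', '', 'no dashes')
```

Index [0] of `rpartition` is therefore empty when the text contains no '---' at all, which is worth keeping in mind when chaining the calls.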
                    

                    I would avoid short, nondescriptive names like m, especially in your early days of programming. Instead, I would encourage you to use variable names that help you know the origin or use of the data without having to write comments. This will accelerate your ability to write more complex logic.

                    • Acky

                      A Markdown document might contain additional --- for horizontal lines. I think this would break @ccc's code. If the document is well formed and wraps the YAML front matter in two --- lines, the following line should work even with more --- in the text:

                      yaml_dict = yaml.load(editor.get_text().partition('---')[2].partition('---')[0])
                      

                      I'm a Python novice myself, so I hope I'm not talking nonsense.
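
One more case a substring search cannot tell apart is a "---" that appears inside a value, for example in a title. A possible variation, sketched below with the standard library only, accepts a delimiter only when a line consists of nothing but "---". The function name `split_front_matter` and the sample document are hypothetical; the returned YAML text could then be handed to the YAML loader as in the posts above.

```python
def split_front_matter(text):
    """Return (yaml_text, body); yaml_text is '' when no front matter opens the text."""
    lines = text.splitlines()
    # Front matter has to start on the very first line with a bare '---'.
    if not lines or lines[0].strip() != "---":
        return "", text
    # Look for the closing delimiter: the next line that is exactly '---'.
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            yaml_text = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return yaml_text, body
    # Opening delimiter without a closing one: treat everything as body.
    return "", text


document = '---\ntitle: "a --- b"\nlink: http://google.com\n---\nIntro\n\n---\n\nMore text'
yaml_text, body = split_front_matter(document)
print(yaml_text)
print(body)
```

Here the dashes inside the quoted title and the later horizontal rule are both left alone, because neither is the first bare "---" line after the opening one.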

                      • ccc

                        @Acky, I wanted to make sure it would work if there are no dividing lines. Between what you have written and what I have written, @jrh147 should be able to find a solution to fit the requirements.

                        • No horizontal lines: ccc works, Acky does not work.
                        • One horizontal line: ccc takes the text before the horizontal line, Acky takes the text after it.
                        • Two horizontal lines: ccc works, Acky works.
                        • Three or more horizontal lines: ccc takes all text between the first and last horizontal lines, Acky takes all text between the first and second lines.
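
The four cases above can be checked without a YAML parser by looking only at which part of the text each approach selects. The sketch below uses two hypothetical helper names and made-up sample strings; the fallback to the whole text in the first helper is an assumption of this sketch for the case where no "---" exists.

```python
def between_first_and_last(text, sep="---"):
    """Text between the first and last separator, with a whole-text fallback."""
    head = text.rpartition(sep)[0] or text
    return head.partition(sep)[2] or head


def between_first_and_second(text, sep="---"):
    """Text between the first and second separator (two chained partition calls)."""
    return text.partition(sep)[2].partition(sep)[0]


# Keys are the number of '---' lines in each sample.
samples = {
    0: "a: 1",
    1: "a: 1\n---\nbody",
    2: "---\na: 1\n---\nbody",
    3: "---\na: 1\n---\nbody\n---\nmore",
}
for count, text in samples.items():
    print(count, repr(between_first_and_last(text)), repr(between_first_and_second(text)))
```

With three separators, the first-to-last selection still contains a "---" line, so a single-document load would likely complain about a second document again.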

                        • jrh147

                          Wow. Amazingly thorough help. Thank you so much. I hadn't thought of additional "---" but appreciate the code should that be a problem in the future. Everything works as expected so again, thank you.
