omz:forum




    Markov chain text?

    Pythonista
    • chrisburdick

      Is there any way in Pythonista to create a Markov chain text generator (or something resembling one)? Something, in other words, that will take user-provided text (either via imported document or copy/paste) and use that as a source library for generating random semi-coherent sentences?

      • omz

        I have sample code for Markov chain text somewhere, just need to clean it up a bit.

        • omz

          This generates some interesting, pseudo-random text from Tolstoi's Anna Karenina:

          #!python3
          
          # Adapted from this blog post: http://agiliq.com/blog/2009/06/generating-pseudo-random-text-with-markov-chains-u/
          
          import random
          import os
          import urllib.request
          
          class Markov(object):
          	
          	def __init__(self, open_file):
          		self.cache = {}
          		self.open_file = open_file
          		self.words = self.file_to_words()
          		self.word_size = len(self.words)
          		self.database()
          		
          	
          	def file_to_words(self):
          		self.open_file.seek(0)
          		data = self.open_file.read()
          		words = data.split()
          		return words
          		
          	
          	def triples(self):
          		""" Generates triples from the given data string. So if our string were
          				"What a lovely day", we'd generate (What, a, lovely) and then
          				(a, lovely, day).
          		"""
          		
          		if len(self.words) < 3:
          			return
          		
          		for i in range(len(self.words) - 2):
          			yield (self.words[i], self.words[i+1], self.words[i+2])
          			
          	def database(self):
          		for w1, w2, w3 in self.triples():
          			key = (w1, w2)
          			if key in self.cache:
          				self.cache[key].append(w3)
          			else:
          				self.cache[key] = [w3]
          				
          	def generate_markov_text(self, size=25):
          		while True:
          			seed = random.randint(0, self.word_size-3)
          			seed_word = self.words[seed]
          			if seed_word[0].isupper():
          				break		
          		seed_word, next_word = self.words[seed], self.words[seed+1]
          		w1, w2 = seed_word, next_word
          		gen_words = []
          		while not w2.endswith('.'):
          			gen_words.append(w1)
          			w1, w2 = w2, random.choice(self.cache[(w1, w2)])
          		# Append both remaining words; appending only w2 would drop the second-to-last word.
          		gen_words.extend([w1, w2])
          		return ' '.join(gen_words)
          			
          def main():
          	if not os.path.exists('anna_karenina.txt'):
          		print('Downloading book...')
          		urllib.request.urlretrieve('http://www.gutenberg.org/files/1399/1399-0.txt', 'anna_karenina.txt')
          		
          	with open('anna_karenina.txt', 'r', encoding='utf-8') as f:
          		markov = Markov(f)
          		print(markov.generate_markov_text())
          
          if __name__ == '__main__':
          	main()
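
For the copy/paste case from the original question, here is a smaller, self-contained sketch of the same idea that works on a plain string instead of a file. It is not part of the script above: `build_table`, `generate`, `max_words` and the sample sentence are made-up names for illustration. In Pythonista the pasted text could presumably come from the `clipboard` module instead of the hard-coded string, but that part is left out here so the sketch runs anywhere.

```python
import random
from collections import defaultdict


def build_table(text):
    """Map each pair of consecutive words to the words seen right after it."""
    words = text.split()
    table = defaultdict(list)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        table[(w1, w2)].append(w3)
    return words, dict(table)


def generate(words, table, max_words=25, rng=random):
    """Walk the table from a random pair until a sentence ends or max_words is reached."""
    if len(words) < 3:
        return ' '.join(words)
    start = rng.randrange(len(words) - 2)
    w1, w2 = words[start], words[start + 1]
    out = [w1, w2]
    while len(out) < max_words and not w2.endswith('.'):
        followers = table.get((w1, w2))
        if not followers:
            # The very last word pair of the text may have no recorded follower.
            break
        w1, w2 = w2, rng.choice(followers)
        out.append(w2)
    return ' '.join(out)


# Hypothetical pasted text; any string with at least three words works.
pasted = "What a lovely day. What a lovely way to spend a lovely day."
words, table = build_table(pasted)
print(table[('What', 'a')])   # followers recorded for the pair ('What', 'a')
print(generate(words, table))
```

With such a short input the output is mostly a reshuffle of the source sentence. The results only get interesting with a larger text, which is why the script above uses a whole novel.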
          
          • Phuket2

            @omz, it was fun to look at this code and run it. But then I went to http://www.gutenberg.org to see other books. What a messy site for such an important resource. Surprising that no one has offered to redo it using Bootstrap or another framework. Just saying...

            • chrisburdick

              Thanks, ole! I'll give it a go!
