omz:forum




    Returning Execution to Main Thread with ui.WebView

    Pythonista
    ui.webview requests scraping
    • frenchesco

      I'm trying to implement a web scraper that uses ui.WebView to render a page and then return the HTML from that page. The pages are rendered with JavaScript, so I can't use requests or mechanize like I normally would.

      The problem with my current implementation (below) is that the main thread finishes executing before the page has finished loading. I want the main thread to wait until the page has finished rendering, and then get the HTML of that page back on the main thread.

      # coding: utf-8
      import ui
      
      class Scraper (object):
      	def __init__(self, callback, url, js = 'document.documentElement.outerHTML'):
      		self.wv = ui.WebView()
      		self.wv.delegate = self
      		self.wv.load_url(url)
      		self.callback = callback
      		self.js = js
      	
      	def webview_did_finish_load(self, webview):
      		self.callback(webview.eval_js(self.js))
      
      # Example:
      def parse_response(response):
      	print('Webview finished loading - ' + response)
      
      def main():
      	s = Scraper(parse_response, 'https://www.google.com', 'document.title;')
      	# How can I wait for the Web View to finish loading here and return the HTML of the webpage before proceeding on the main thread?
      	print('Main thread finished executing')
      
      if __name__ == '__main__':
      	main()
      

      Does anyone know the best way of resolving this issue?

      • ccc

        ui.View.wait_modal (http://omz-software.com/pythonista/docs/ios/ui.html#ui.View.wait_modal) will wait until the user closes the view or ui.View.close (http://omz-software.com/pythonista/docs/ios/ui.html#ui.View.close) is called on the view.

        • JonB

          A thread synchronization primitive, such as threading.Event or threading.Semaphore, works nicely here. wait_modal requires you to present and then close the view.

          import ui
          import threading
          
          class Scraper (object):
              def __init__(self, callback, url, js = 'document.documentElement.outerHTML'):
                  self.callback = callback
                  self.js = js
                  # Set by the delegate callback once the page has loaded
                  self.ready_event = threading.Event()
                  self.wv = ui.WebView()
                  self.wv.delegate = self
                  # Start loading last, so every attribute the callback uses already exists
                  self.wv.load_url(url)
          
              def webview_did_finish_load(self, webview):
                  self.callback(webview.eval_js(self.js))
                  self.ready_event.set()
          
          # Example:
          def parse_response(response):
              print('Webview finished loading - ' + response)
          
          def main():
              s = Scraper(parse_response, 'https://www.google.com', 'document.title;')
              # Block until the scraper has finished loading
              s.ready_event.wait()
              print('Main thread finished executing')
          
          if __name__ == '__main__':
              main()
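
For comparison, the threading.Semaphore option mentioned above could look roughly like the sketch below. It does not use ui.WebView at all: `FakeLoader` is a hypothetical stand-in that "finishes loading" from a threading.Timer, and the `'page title'` string is a placeholder for whatever eval_js would return. Treat it as an illustration of the synchronization pattern only, not as tested Pythonista code.

```python
import threading


class FakeLoader(object):
    """Hypothetical stand-in for the Scraper: 'finishes loading' after a delay."""

    def __init__(self, delay=0.1):
        self.result = None
        # Start the counter at 0 so the first acquire() blocks until release().
        self.done = threading.Semaphore(0)
        # The timer calls _finished on another thread, similar to how the
        # WebView delegate callback arrives outside the waiting thread.
        timer = threading.Timer(delay, self._finished)
        timer.daemon = True
        timer.start()

    def _finished(self):
        self.result = 'page title'  # placeholder for webview.eval_js(...)
        self.done.release()         # wake up the waiting thread


def main():
    loader = FakeLoader()
    loader.done.acquire()  # blocks until _finished() has run
    print('Loaded: ' + loader.result)
    return loader.result


if __name__ == '__main__':
    main()
```

One difference from threading.Event: acquire() consumes the signal, so a second acquire() would block again, whereas an Event stays set until it is cleared. For a one-shot "page loaded" notification either one should do.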
          • frenchesco
            frenchesco @JonB

            @JonB Thanks! That's exactly what I was looking for. For anyone else who needs to do this in the future, here's my final code:

            # coding: utf-8
            import ui
            import threading
            
            class Scraper (object):
            	def __init__(self, url, js = 'document.documentElement.outerHTML'):
            		self.js = js
            		self.response = ''
            		# Create the event before loading, so the delegate callback can always set it
            		self.ready_event = threading.Event()
            		self.wv = ui.WebView()
            		self.wv.delegate = self
            		self.wv.load_url(url)
            		# Block the calling thread until the page has finished loading
            		self.ready_event.wait()
            	
            	def webview_did_finish_load(self, webview):
            		self.response = webview.eval_js(self.js)
            		self.ready_event.set()
            
            def main():
            	r = Scraper('https://www.google.com', 'document.title;').response
            	print('Response: ' + r)
            
            if __name__ == '__main__':
            	main()
            

            I basically got rid of the callback: the constructor blocks until the page has loaded, and I read the final response once execution returns to the main thread, since that's where I want to continue processing.
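
One caveat with blocking like this is that wait() with no argument never returns if the page never finishes loading. A possible refinement, which is not part of the code above, is to pass a timeout to wait(). The sketch below shows the idea without ui.WebView: `TimedWaiter`, `finish` and `wait_for_response` are hypothetical names, and `finish` stands in for the body of webview_did_finish_load.

```python
import threading


class TimedWaiter(object):
    """Hypothetical helper: waits for a result, but gives up after a timeout."""

    def __init__(self):
        self.response = None
        self.ready_event = threading.Event()

    def finish(self, value):
        # In the Scraper this would be the body of webview_did_finish_load.
        self.response = value
        self.ready_event.set()

    def wait_for_response(self, timeout):
        # wait() returns once set() is called or the timeout (seconds) elapses;
        # is_set() afterwards tells the two cases apart.
        self.ready_event.wait(timeout)
        if not self.ready_event.is_set():
            return None  # timed out, no response available
        return self.response


def main():
    ok = TimedWaiter()
    # Simulate a page that finishes loading after 50 ms.
    threading.Timer(0.05, ok.finish, args=['document title']).start()
    first = ok.wait_for_response(timeout=2.0)

    stuck = TimedWaiter()  # finish() is never called, like a page that never loads
    second = stuck.wait_for_response(timeout=0.1)

    print(first)
    print(second)
    return first, second


if __name__ == '__main__':
    main()
```

In the real Scraper, the same effect might be had by calling self.ready_event.wait(some_timeout) in the constructor and treating an unset event as a failed load. The timeout value is something you would have to pick for your own pages.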

            • frenchesco
              frenchesco @ccc

              @ccc Thanks for your response. It worked well, except that I couldn't get it to work without displaying the WebView, which I was hoping to avoid. I ended up using the method suggested by @JonB to get around this.
