str = "puts \"a\""
0.upto(25) {
eval(str)
str.next!
}
This blog is about scraping with Ruby… Seriously…
I just want to share this great article.
http://www.ibm.com/developerworks/linux/library/l-spider/
The code is pretty much self-explanatory. If you have questions, don’t hesitate to ask.
require 'rubygems'
require 'mechanize'
agent = Mechanize.new # older releases named this WWW::Mechanize
page = agent.get 'http://images.google.com/'
form = page.forms[0]
form['q'] = 'obama' # the search term goes here
page = agent.submit form
uri = page.links[18].uri.to_s # gets the URL; the index depends on the page layout
uri = uri[15..uri.index('&')-1] # keep what sits between the 15-character prefix and the first '&'
agent.get(uri).save_as(File.basename(uri).split("?")[0]) # saves the picture to the current working directory
puts 'done'
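The slice `uri[15..uri.index('&')-1]` only works while the link keeps the same fixed-length prefix. Here is a rough sketch of pulling the image address out by parameter name instead. It assumes the result links look like `/imgres?imgurl=...&imgrefurl=...`, which is my guess from the number 15. The helper names `image_url_from` and `local_name_for` and the example.com address are made up for illustration.

```ruby
require 'uri'

# Return the value of the "imgurl" query parameter, or nil if it is missing.
# Assumption: the link looks like "/imgres?imgurl=<image address>&...".
def image_url_from(link)
  query = URI.parse(link).query
  return nil if query.nil?
  pair = URI.decode_www_form(query).assoc('imgurl')
  pair && pair.last
end

# File name to save under: the last path segment, without any "?..." part.
def local_name_for(image_url)
  File.basename(URI.parse(image_url).path)
end

link = '/imgres?imgurl=http://example.com/pics/a.jpg&imgrefurl=http://example.com/'
image = image_url_from(link)
puts image
puts local_name_for(image)
```

With something like this, the script could call `agent.get(image).save_as(local_name_for(image))` and would not care how long the prefix is.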
I recently came across this Gmail scraping tutorial and got inspired to try it myself. Here is my modified version of the code:
require 'rubygems'
require 'mechanize'
agent = Mechanize.new # older releases named this WWW::Mechanize
page = agent.get 'http://gmail.com'
form = page.forms[0]
form['Email'] = '***email***'
form['Passwd'] = '***password***'
page = agent.submit form
page = agent.get page.meta[0].href #redirection
puts page.links[15]
puts page.links[20]
puts page.links[27..37]
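Picking links by position (15, 20, 27..37) breaks as soon as the page gains or loses a link. Here is a small sketch of choosing them by their text instead. The hashes below are made-up sample data standing in for the link objects, and `links_matching` is a hypothetical helper, not part of Mechanize.

```ruby
# Return the links whose visible text matches the given pattern.
# Each link here is a plain hash with :text and :href keys (sample data only).
def links_matching(links, pattern)
  links.select { |link| link[:text] =~ pattern }
end

links = [
  { text: 'Inbox (3)',    href: '?v=inbox' },
  { text: 'Compose Mail', href: '?v=compose' },
  { text: 'Sent Mail',    href: '?v=sent' }
]

links_matching(links, /inbox/i).each { |link| puts link[:text] }
```

On a real page, Mechanize's `page.links_with(:text => /inbox/i)` should do the same kind of filtering directly on the link objects.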
Tk widget code:
require 'tk'
require 'rubygems'
require 'mechanize'
agent = Mechanize.new # older releases named this WWW::Mechanize
page = agent.get 'http://gmail.com'
form = page.forms[0]
form['Email'] = '***email***'
form['Passwd'] = '***password***'
page = agent.submit form
page = agent.get page.meta[0].href #redirection
string = [page.links[15], page.links[20], *page.links[27..37]].join("\n") # one link text per line
root = TkRoot.new # tk start
text = TkText.new(root) { width 100; height 100; font TkFont.new('helvetica 13 bold') }.pack("side"=>"left")
text.insert('end', string)
Tk.mainloop # end tk
I like doing everything in the Linux terminal. That is why I made a simple script that lets me tweet directly from my computer without opening Mozilla and logging into Twitter. Here it is:
require 'rubygems'
require 'mechanize'
agent = Mechanize.new # older releases named this WWW::Mechanize
page = agent.get 'http://twitter.com/login'
form = page.forms[1]
form["session[username_or_email]"] = '***type your username or email here***'
form["session[password]"] = '***type your password here***'
page = agent.submit form
form = page.forms[1]
puts "Logged in"
form["status"] = gets.chomp
page = agent.submit form
The only problem is that you don’t know how many characters you have left. This is the final code for the simple tweeting widget I made. I used Ruby/Tk because it is very easy to learn, yet powerful. It counts the remaining characters as you type.
require 'tk'
require 'rubygems'
require 'mechanize'
agent = Mechanize.new # login start (older releases named this WWW::Mechanize)
page = agent.get 'http://twitter.com/login'
form = page.forms[1]
form["session[username_or_email]"] = '***type your username or email here***'
form["session[password]"] = '***type your password here***'
page = agent.submit form
form = page.forms[1] # login end
root = TkRoot.new # tk start
@text = TkVariable.new
event = TkEntry.new(root, 'textvariable' => @text) do
width 130
font TkFont.new('helvetica 10') # this is optional
pack
end
lbl = TkLabel.new(root) do
text '160'
pack
end
event.bind("KeyRelease") { lbl.configure('text' => (160 - @text.value.length).to_s) } # KeyRelease fires after the entry has updated, so the count is not one key behind
update = TkButton.new(root) do
text "Update"
pack
end
update.command do
form["status"] = @text.value
page = agent.submit form
end
TkButton.new(root) do
text "Cancel"
command { exit }
pack
end
Tk.mainloop # end tk
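The counter in the widget is just a subtraction, so it can be tried out without Tk. Here is a minimal sketch, using the same starting value of 160 as the label above. The name `remaining` is made up for illustration.

```ruby
# Characters left before the limit is reached; negative means the text is too long.
# The default limit mirrors the 160 shown in the widget's label.
def remaining(text, limit = 160)
  limit - text.length
end

status = 'Hello from the terminal'
left = remaining(status)
if left < 0
  puts "Too long by #{-left} characters"
else
  puts "#{left} characters left"
end
```

The same check could run in the terminal script right after `gets.chomp`, before the form is submitted.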
Have fun! ^_^