Filename: /opt/alt/python33/lib64/python3.3/urllib/__pycache__/robotparser.cpython-33.pyc
""" robotparser.py

    Copyright (C) 2000  Bastian Kleineidam

    You can choose between two licenses when using this package:
    1) GNU GPLv2
    2) PSF license for Python 2.2

    The robots.txt Exclusion Protocol is implemented as specified in
    http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
"""

import urllib.parse, urllib.request

__all__ = ["RobotFileParser"]


class RobotFileParser:
    """ This class provides a set of methods to read, parse and answer
    questions about a single robots.txt file.
    """

    def __init__(self, url=''):
        self.entries = []
        self.default_entry = None
        self.disallow_all = False
        self.allow_all = False
        self.set_url(url)
        self.last_checked = 0

    def mtime(self):
        """Returns the time the robots.txt file was last fetched.

        This is useful for long-running web spiders that need to
        check for new robots.txt files periodically.
        """
        return self.last_checked

    def modified(self):
        """Sets the time the robots.txt file was last fetched to the
        current time.
        """
        import time
        self.last_checked = time.time()

    def set_url(self, url):
        """Sets the URL referring to a robots.txt file."""
        self.url = url
        self.host, self.path = urllib.parse.urlparse(url)[1:3]

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser."""
        try:
            f = urllib.request.urlopen(self.url)
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif err.code >= 400:
                self.allow_all = True
        else:
            raw = f.read()
            self.parse(raw.decode("utf-8").splitlines())

    def _add_entry(self, entry):
        if "*" in entry.useragents:
            # the default entry is considered last
            if self.default_entry is None:
                self.default_entry = entry
        else:
            self.entries.append(entry)

    def parse(self, lines):
        """Parse the input lines from a robots.txt file.

        We allow that a user-agent: line is not preceded by
        one or more blank lines.
        """
        # states:
        #   0: start state
        #   1: saw user-agent line
        #   2: saw an allow or disallow line
        state = 0
        entry = Entry()

        for line in lines:
            if not line:
                if state == 1:
                    entry = Entry()
                    state = 0
                elif state == 2:
                    self._add_entry(entry)
                    entry = Entry()
                    state = 0
            # remove optional comment and strip line
            i = line.find('#')
            if i >= 0:
                line = line[:i]
            line = line.strip()
            if not line:
                continue
            line = line.split(':', 1)
            if len(line) == 2:
                line[0] = line[0].strip().lower()
                line[1] = urllib.parse.unquote(line[1].strip())
                if line[0] == "user-agent":
                    if state == 2:
                        self._add_entry(entry)
                        entry = Entry()
                    entry.useragents.append(line[1])
                    state = 1
                elif line[0] == "disallow":
                    if state != 0:
                        entry.rulelines.append(RuleLine(line[1], False))
                        state = 2
                elif line[0] == "allow":
                    if state != 0:
                        entry.rulelines.append(RuleLine(line[1], True))
                        state = 2
        if state == 2:
            self._add_entry(entry)

    def can_fetch(self, useragent, url):
        """using the parsed robots.txt decide if useragent can fetch url"""
        if self.disallow_all:
            return False
        if self.allow_all:
            return True
        # search for given user agent matches; the first match counts
        parsed_url = urllib.parse.urlparse(urllib.parse.unquote(url))
        url = urllib.parse.urlunparse(('', '', parsed_url.path,
            parsed_url.params, parsed_url.query, parsed_url.fragment))
        url = urllib.parse.quote(url)
        if not url:
            url = "/"
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.allowance(url)
        # try the default entry last
        if self.default_entry:
            return self.default_entry.allowance(url)
        # agent not found ==> access granted
        return True

    def __str__(self):
        return ''.join([str(entry) + "\n" for entry in self.entries])


class RuleLine:
    """A rule line is a single "Allow:" (allowance==True) or "Disallow:"
       (allowance==False) followed by a path."""

    def __init__(self, path, allowance):
        if path == '' and not allowance:
            # an empty value means allow all
            allowance = True
        path = urllib.parse.urlunparse(urllib.parse.urlparse(path))
        self.path = urllib.parse.quote(path)
        self.allowance = allowance

    def applies_to(self, filename):
        return self.path == "*" or filename.startswith(self.path)

    def __str__(self):
        return (self.allowance and "Allow" or "Disallow") + ": " + self.path


class Entry:
    """An entry has one or more user-agents and zero or more rulelines"""

    def __init__(self):
        self.useragents = []
        self.rulelines = []

    def __str__(self):
        ret = []
        for agent in self.useragents:
            ret.extend(["User-agent: ", agent, "\n"])
        for line in self.rulelines:
            ret.extend([str(line), "\n"])
        return ''.join(ret)

    def applies_to(self, useragent):
        """check if this entry applies to the specified agent"""
        # split the name token and make it lower case
        useragent = useragent.split("/")[0].lower()
        for agent in self.useragents:
            if agent == '*':
                # we have the catch-all agent
                return True
            agent = agent.lower()
            if agent in useragent:
                return True
        return False

    def allowance(self, filename):
        """Preconditions:
        - our agent applies to this entry
        - filename is URL decoded"""
        for line in self.rulelines:
            if line.applies_to(filename):
                return line.allowance
        return True
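A minimal usage sketch of the module above, assuming the stdlib `urllib.robotparser` is available. It feeds robots.txt lines directly to `parse()` instead of fetching over the network via `read()`; the agent name and paths are made up for illustration:

```python
# Sketch: exercise RobotFileParser without any network access by
# calling parse() on an in-memory robots.txt (agent/paths hypothetical).
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.parse("""
User-agent: *
Disallow: /private/
Allow: /
""".splitlines())

# The catch-all "*" entry becomes the default entry, consulted last.
print(rp.can_fetch("MyCrawler/1.0", "/private/data.html"))  # False
print(rp.can_fetch("MyCrawler/1.0", "/public/index.html"))  # True
```

Note that rules are matched by simple `startswith` prefix tests in source order, and the first matching `Allow`/`Disallow` line wins; an agent with no matching entry is granted access.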