1. The previously broken retry function is working again

New functions:
1. Reworked the datetime_transform function to handle more month typos
2. Split the process_article function into 3 functions so that multithreading gives a better running time
3. Updated the article URL search technique to handle the different volume websites
4. Added more exception handling
5. Improved the keyword and affiliation stripping method
6. Added methods for processing author data when no author table exists
7. Added code to retry papers that failed processing
8. Error messages are now stored in more detail
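The multithreading and retry items above can be illustrated with a minimal sketch. This is not the repository's actual code: `process_all`, `process_one`, `max_workers`, and `max_retries` are hypothetical names chosen for illustration, and the sketch assumes each paper is identified by its URL and handled by a single callable. It uses only the standard-library `concurrent.futures` module.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_all(urls, process_one, max_workers=4, max_retries=2):
    """Process URLs in parallel and retry the ones that fail.

    Hypothetical helper: `process_one` stands in for whatever function
    scrapes a single article. Returns (results, errors), where results
    maps url -> return value and errors maps url -> message of the last
    exception for URLs that never succeeded.
    """
    results, errors = {}, {}
    pending = list(urls)
    # One initial pass plus up to `max_retries` retry passes.
    for _attempt in range(max_retries + 1):
        if not pending:
            break
        failed = []
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(process_one, url): url for url in pending}
            for future in as_completed(futures):
                url = futures[future]
                try:
                    results[url] = future.result()
                    errors.pop(url, None)  # clear an error from an earlier pass
                except Exception as exc:
                    # Keep the exception type with the message for later review.
                    errors[url] = f"{type(exc).__name__}: {exc}"
                    failed.append(url)
        pending = failed  # only failed URLs go into the next pass
    return results, errors
```

A call such as `process_all(article_urls, scrape_article)` (both names hypothetical) would return the successfully processed papers together with a per-URL record of the last error, which could then be written to whatever error store the crawler uses.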
CST_scrawlCode
Stores the CST group's web crawler code.
Languages: Python 100%