Commits on Source (10)
- Nicolas Allègre authored all 10 commits
Showing 10 changed files
- .gitignore: 166 additions, 0 deletions
- Parser.py: 6 additions, 2 deletions
- README.md: 70 additions, 6 deletions
- create_corpus.py: 28 additions, 20 deletions
- create_corpus_before_lang.py: 73 additions, 0 deletions
- dl_docs.py: 29 additions, 24 deletions
- parse_docs.py: 13 additions, 8 deletions
- preprocess.py: 12 additions, 11 deletions
- requirements.txt: 1 addition, 0 deletions
- utils_loadData.py: 32 additions, 0 deletions
.gitignore
0 → 100644
create_corpus_before_lang.py
0 → 100644
requirements.txt
@@ -9,3 +9,4 @@ numpy
 scikit-learn
 bs4
 python-magic
+langdetect
\ No newline at end of file
utils_loadData.py
0 → 100644