
Python Natural Language Processing (NLP) Tools: A Roundup




NLTK


  • Overview:

NLTK is the leading platform for working with natural language in Python. It provides an interface to the WordNet lexical resource, along with libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.


  • Website:

Natural Language Toolkit


  • Installation:

Install NLTK with pip:

[root@master ~]# pip install nltk
Collecting nltk
  Downloading nltk-3.2.1.tar.gz (1.1MB)
    100% |████████████████████████████████| 1.1MB 664kB/s
Installing collected packages: nltk
  Running setup.py install for nltk ... done
Successfully installed nltk-3.2.1

  • Note:
    After installation you must also download the NLTK corpora before the library is usable. The corpora arrive as compressed files that need to be extracted under the nltk_data directory, giving a layout like this:

zang@ZANG-PC D:\nltk_data
> ls -al
total 44
drwxrwx---+ 1 Administrators None 0 Oct 25 2015 .
drwxrwx---+ 1 SYSTEM SYSTEM 0 May 30 10:55 ..
drwxrwx---+ 1 Administrators None 0 Oct 25 2015 chunkers
drwxrwx---+ 1 Administrators None 0 Oct 25 2015 corpora
drwxrwx---+ 1 Administrators None 0 Oct 25 2015 grammers
drwxrwx---+ 1 Administrators None 0 Oct 25 2015 help
drwxrwx---+ 1 Administrators None 0 Oct 25 2015 stemmers
drwxrwx---+ 1 Administrators None 0 Oct 25 2015 taggers
drwxrwx---+ 1 Administrators None 0 Oct 25 2015 tokenizers
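NLTK ships ready-made tokenizers and a FreqDist class for exactly this kind of work. As a dependency-free illustration of what tokenization and a frequency distribution involve, here is a rough sketch in plain Python — the regex tokenizer and Counter below are simplified stand-ins, not NLTK's actual implementation:

```python
import re
from collections import Counter

def tokenize(text):
    """Crude word tokenizer: lowercase words, keeping contractions together.
    NLTK's word_tokenize is far more sophisticated; this only sketches the idea."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

text = "NLTK is a leading platform. It isn't the only one, but it's a leading one."
tokens = tokenize(text)
freq = Counter(tokens)          # stands in for nltk.FreqDist(tokens)

print(tokens[:4])               # first few tokens
print(freq.most_common(2))      # most frequent words
```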



Pattern


  • Overview:

Pattern is a web mining module for Python, with tools for:
* Data mining: web service wrappers (Google, Twitter, Wikipedia), a web crawler, and an HTML DOM parser.
* Natural language processing: part-of-speech tagging, n-gram search, sentiment analysis, and WordNet.
* Machine learning: the vector space model (VSM), clustering, and classification (KNN, SVM, perceptron).
* Network analysis: graph centrality and visualization.
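Pattern's machine-learning component is built around the vector space model, where documents become term-frequency vectors compared by cosine similarity. Here is a dependency-free sketch of that core idea — Pattern's own Document/Model classes wrap this up for you, so everything below is illustrative:

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

d1 = Counter("the cat sat on the mat".split())
d2 = Counter("the cat lay on the rug".split())
d3 = Counter("stock markets fell sharply".split())

print(round(cosine(d1, d2), 3))   # similar sentences -> high score (0.75)
print(round(cosine(d1, d3), 3))   # no shared terms -> 0.0
```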




  • Website:

GitHub page


  • Installation:

[root@master ~]# pip install pattern
Collecting pattern
  Downloading pattern-2.6.zip (24.6MB)
    100% |████████████████████████████████| 24.6MB 43kB/s
Installing collected packages: pattern
  Running setup.py install for pattern ... done
Successfully installed pattern-2.6
[root@master ~]#



TextBlob


  • Overview:

    TextBlob is built on top of NLTK and Pattern and inherits features of both, including:

    • Noun phrase extraction
    • Part-of-speech tagging
    • Sentiment analysis
    • Classification (Naive Bayes, Decision Tree)
    • Translation via Google Translate
    • Tokenization (splitting text into words and sentences)
    • Word and phrase frequencies
    • Parsing
    • n-grams
    • Word inflection and lemmatization
    • Spelling correction
    • Adding new models or languages through extensions
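To give a flavor of what sentiment analysis can look like under the hood, here is a toy lexicon-based scorer in plain Python. TextBlob's actual sentiment property uses Pattern's trained lexicon; the miniature word list below is invented purely for illustration:

```python
# Hypothetical miniature sentiment lexicon; real systems score thousands of words.
LEXICON = {"good": 1.0, "great": 1.0, "love": 0.8,
           "bad": -1.0, "terrible": -1.0, "hate": -0.8}

def polarity(text):
    """Average lexicon score of the recognized words, in [-1, 1]."""
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

print(polarity("this library is great and i love it"))   # positive
print(polarity("a terrible terrible idea"))              # negative
```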
  • Website:

TextBlob: Simplified Text Processing


  • Installation:

[root@master ~]# pip install -U textblob
Collecting textblob
  Downloading textblob-0.11.1-py2.py3-none-any.whl (634kB)
    100% |████████████████████████████████| 634kB 1.1MB/s
Requirement already up-to-date: nltk>=3.1 in /usr/lib/python2.7/site-packages (from textblob)
Installing collected packages: textblob
Successfully installed textblob-0.11.1
[root@master ~]# python -m textblob.download_corpora
[nltk_data] Downloading package brown to /root/nltk_data...
[nltk_data] Unzipping corpora/brown.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Unzipping corpora/wordnet.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] /root/nltk_data...
[nltk_data] Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package conll2000 to /root/nltk_data...
[nltk_data] Unzipping corpora/conll2000.zip.
[nltk_data] Downloading package movie_reviews to /root/nltk_data...
[nltk_data] Unzipping corpora/movie_reviews.zip.
Finished.



Gensim


  • Overview:

Gensim is a Python library for topic modelling, document indexing, and similarity retrieval over large corpora. It can process input data that is larger than available RAM, and its author describes it as "the most robust, efficient and hassle-free piece of software to realize unsupervised semantic modelling from plain text."
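The key to Gensim's larger-than-RAM processing is streaming: a corpus is any iterable that yields one document at a time, so the whole dataset never needs to sit in memory. Here is a dependency-free sketch of that pattern — Gensim's Dictionary and TfidfModel do the real work; this only shows the streaming shape, with a made-up three-document corpus:

```python
from collections import Counter

def stream_corpus():
    """Yield one tokenized document at a time -- this could just as easily
    read line by line from a file far larger than RAM."""
    docs = ["human machine interface",
            "machine learning of language",
            "human language understanding"]
    for doc in docs:
        yield doc.split()

# One streaming pass: document frequency of each term, no corpus held in memory.
df = Counter()
n_docs = 0
for tokens in stream_corpus():
    n_docs += 1
    df.update(set(tokens))

print(n_docs)          # 3 documents seen
print(df["human"])     # "human" occurs in 2 of them
```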



  • Website:

Gensim HomePage

GitHub - piskvorky/gensim: Topic Modelling for Humans


  • Installation:

[root@master ~]# pip install -U gensim
Collecting gensim
  Downloading gensim-0.12.4.tar.gz (2.4MB)
    100% |████████████████████████████████| 2.4MB 358kB/s
Collecting numpy>=1.3 (from gensim)
  Downloading numpy-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl (15.3MB)
    100% |████████████████████████████████| 15.3MB 66kB/s
Collecting scipy>=0.7.0 (from gensim)
  Downloading scipy-0.17.1-cp27-cp27mu-manylinux1_x86_64.whl (39.5MB)
    100% |████████████████████████████████| 39.5MB 27kB/s
Requirement already up-to-date: six>=1.5.0 in /usr/lib/python2.7/site-packages/six-1.10.0-py2.7.egg (from gensim)
Collecting smart_open>=1.2.1 (from gensim)
  Downloading smart_open-1.3.3.tar.gz
Collecting boto>=2.32 (from smart_open>=1.2.1->gensim)
  Downloading boto-2.40.0-py2.py3-none-any.whl (1.3MB)
    100% |████████████████████████████████| 1.4MB 634kB/s
Requirement already up-to-date: bz2file in /usr/lib/python2.7/site-packages (from smart_open>=1.2.1->gensim)
Collecting requests (from smart_open>=1.2.1->gensim)
  Downloading requests-2.10.0-py2.py3-none-any.whl (506kB)
    100% |████████████████████████████████| 512kB 1.4MB/s
Installing collected packages: numpy, scipy, boto, requests, smart-open, gensim
  Found existing installation: numpy 1.10.1
    Uninstalling numpy-1.10.1:
      Successfully uninstalled numpy-1.10.1
  Found existing installation: scipy 0.12.1
    DEPRECATION: Uninstalling a distutils installed project (scipy) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
    Uninstalling scipy-0.12.1:
      Successfully uninstalled scipy-0.12.1
  Found existing installation: boto 2.38.0
    Uninstalling boto-2.38.0:
      Successfully uninstalled boto-2.38.0
  Found existing installation: requests 2.8.1
    Uninstalling requests-2.8.1:
      Successfully uninstalled requests-2.8.1
  Found existing installation: smart-open 1.3.1
    Uninstalling smart-open-1.3.1:
      Successfully uninstalled smart-open-1.3.1
  Running setup.py install for smart-open ... done
  Found existing installation: gensim 0.12.3
    Uninstalling gensim-0.12.3:
      Successfully uninstalled gensim-0.12.3
  Running setup.py install for gensim ... done
Successfully installed boto-2.40.0 gensim-0.12.4 numpy-1.11.0 requests-2.6.0 scipy-0.17.1 smart-open-1.3.3



PyNLPl


  • Overview:

Its full name is the Python Natural Language Processing Library (pronounced "pineapple"). It is a library for natural language processing tasks that collects a variety of independent or loosely interrelated modules, common and otherwise, that are useful for NLP. PyNLPl can be used for n-gram search, computing frequency lists and distributions, and building language models. It also covers more advanced data structures such as priority queues, and more advanced algorithms such as beam search.
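Beam search, one of the algorithms PyNLPl provides, keeps only the k best partial hypotheses at each step instead of exploring every possibility. Here is a minimal dependency-free sketch over a toy scoring problem — the expansion function, scoring function, and beam width are all invented for illustration, not PyNLPl's API:

```python
import heapq

def beam_search(start, expand, score, width=2, steps=3):
    """Generic beam search: keep the `width` best sequences at each step.
    `expand` maps a sequence to candidate extensions; `score` ranks sequences."""
    beam = [start]
    for _ in range(steps):
        candidates = [seq + [tok] for seq in beam for tok in expand(seq)]
        beam = heapq.nlargest(width, candidates, key=score)
    return beam[0]

# Toy problem: build the highest-scoring digit sequence.
best = beam_search([], expand=lambda seq: [1, 2, 3], score=sum, width=2, steps=3)
print(best)   # [3, 3, 3]
```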


  • Website:

Github

PyNLPl HomePage


  • Installation:

Download the source from GitHub, extract it, then build and install:

[root@master pynlpl-master]# python setup.py install
Preparing build
running install
running bdist_egg
running egg_info
creating PyNLPl.egg-info
writing requirements to PyNLPl.egg-info/requires.txt
writing PyNLPl.egg-info/PKG-INFO
writing top-level names to PyNLPl.egg-info/top_level.txt
writing dependency_links to PyNLPl.egg-info/dependency_links.txt
writing manifest file 'PyNLPl.egg-info/SOURCES.txt'
reading manifest file 'PyNLPl.egg-info/SOURCES.txt'
writing manifest file 'PyNLPl.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib
creating build/lib/pynlpl
copying pynlpl/tagger.py -> build/lib/pynlpl
......
byte-compiling build/bdist.linux-x86_64/egg/pynlpl/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/pynlpl/mt/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-x86_64/egg/pynlpl/mt/wordalign.py to wordalign.pyc
byte-compiling build/bdist.linux-x86_64/egg/pynlpl/statistics.py to statistics.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying PyNLPl.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying PyNLPl.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying PyNLPl.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying PyNLPl.egg-info/not-zip-safe -> build/bdist.linux-x86_64/egg/EGG-INFO
copying PyNLPl.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying PyNLPl.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
creating dist
creating 'dist/PyNLPl-0.9.2-py2.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing PyNLPl-0.9.2-py2.7.egg
creating /usr/lib/python2.7/site-packages/PyNLPl-0.9.2-py2.7.egg
Extracting PyNLPl-0.9.2-py2.7.egg to /usr/lib/python2.7/site-packages
Adding PyNLPl 0.9.2 to easy-install.pth file

Installed /usr/lib/python2.7/site-packages/PyNLPl-0.9.2-py2.7.egg
Processing dependencies for PyNLPl==0.9.2
Searching for httplib2>=0.6
Reading https://pypi.python.org/simple/httplib2/
Best match: httplib2 0.9.2
Downloading https://pypi.python.org/packages/ff/a9/5751cdf17a70ea89f6dde23ceb1705bfb638fd8cee00f845308bf8d26397/httplib2-0.9.2.tar.gz#md5=bd1b1445b3b2dfa7276b09b1a07b7f0e
Processing httplib2-0.9.2.tar.gz
Writing /tmp/easy_install-G32Vg8/httplib2-0.9.2/setup.cfg
Running httplib2-0.9.2/setup.py -q bdist_egg --dist-dir /tmp/easy_install-G32Vg8/httplib2-0.9.2/egg-dist-tmp-IgKi70
zip_safe flag not set; analyzing archive contents…
httplib2.init: module references file
Adding httplib2 0.9.2 to easy-install.pth file

Installed /usr/lib/python2.7/site-packages/httplib2-0.9.2-py2.7.egg
Searching for numpy==1.11.0
Best match: numpy 1.11.0
Adding numpy 1.11.0 to easy-install.pth file

Using /usr/lib64/python2.7/site-packages
Searching for lxml==3.2.1
Best match: lxml 3.2.1
Adding lxml 3.2.1 to easy-install.pth file

Using /usr/lib64/python2.7/site-packages
Finished processing dependencies for PyNLPl==0.9.2




spaCy


  • Overview:

spaCy is commercially backed open-source software that combines Python and Cython into an excellent NLP toolkit: a fast, state-of-the-art natural language processing tool.


  • Website:

HomePage

GitHub


  • Installation:

[root@master pynlpl-master]# pip install spacy
Collecting spacy
  Downloading spacy-0.101.0-cp27-cp27mu-manylinux1_x86_64.whl (5.7MB)
    100% |████████████████████████████████| 5.7MB 161kB/s
Collecting thinc<5.1.0,>=5.0.0 (from spacy)
  Downloading thinc-5.0.8-cp27-cp27mu-manylinux1_x86_64.whl (1.4MB)
    100% |████████████████████████████████| 1.4MB 287kB/s
Collecting murmurhash<0.27,>=0.26 (from spacy)
  Downloading murmurhash-0.26.4-cp27-cp27mu-manylinux1_x86_64.whl
Collecting cloudpickle (from spacy)
  Downloading cloudpickle-0.2.1-py2.py3-none-any.whl
Collecting plac (from spacy)
  Downloading plac-0.9.1.tar.gz (151kB)
    100% |████████████████████████████████| 153kB 3.2MB/s
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7 in /usr/lib64/python2.7/site-packages (from spacy)
Requirement already satisfied (use --upgrade to upgrade): six in /usr/lib/python2.7/site-packages/six-1.10.0-py2.7.egg (from spacy)
Collecting cymem<1.32,>=1.30 (from spacy)
  Downloading cymem-1.31.2-cp27-cp27mu-manylinux1_x86_64.whl (66kB)
    100% |████████████████████████████████| 71kB 4.3MB/s
Collecting preshed<0.47,>=0.46.1 (from spacy)
  Downloading preshed-0.46.4-cp27-cp27mu-manylinux1_x86_64.whl (223kB)
    100% |████████████████████████████████| 225kB 2.4MB/s
Collecting sputnik<0.10.0,>=0.9.2 (from spacy)
  Downloading sputnik-0.9.3-py2.py3-none-any.whl
Collecting semver (from sputnik<0.10.0,>=0.9.2->spacy)
  Downloading semver-2.5.0.tar.gz
Installing collected packages: murmurhash, cymem, preshed, thinc, cloudpickle, plac, semver, sputnik, spacy
  Running setup.py install for plac ... done
  Running setup.py install for semver ... done
Successfully installed cloudpickle-0.2.1 cymem-1.31.2 murmurhash-0.26.4 plac-0.9.1 preshed-0.46.4 semver-2.5.0 spacy-0.101.0 sputnik-0.9.3 thinc-5.0.8



Polyglot


  • Overview:

Polyglot supports large-scale multilingual applications. It offers tokenization for 165 languages, language detection for 196 languages, named entity recognition for 40 languages, part-of-speech tagging for 16 languages, sentiment analysis for 136 languages, word embeddings for 137 languages, morphological analysis for 135 languages, and transliteration for 69 languages. Feature summary:
Tokenization (165 Languages)
Language detection (196 Languages)
Named Entity Recognition (40 Languages)
Part of Speech Tagging (16 Languages)
Sentiment Analysis (136 Languages)
Word Embeddings (137 Languages)
Morphological analysis (135 Languages)
Transliteration (69 Languages)
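Language detection — which Polyglot delegates to the trained pycld2 detector listed in the install log below — can be crudely approximated by counting which Unicode script the characters fall into. The sketch below is a toy for illustration only, nothing like the real statistical models:

```python
def guess_script(text):
    """Very rough script detector based on Unicode code-point ranges.
    Real detectors (pycld2 under Polyglot) use trained statistical models."""
    counts = {"latin": 0, "cjk": 0, "cyrillic": 0}
    for ch in text:
        cp = ord(ch)
        if 0x4E00 <= cp <= 0x9FFF:          # CJK Unified Ideographs
            counts["cjk"] += 1
        elif 0x0400 <= cp <= 0x04FF:        # Cyrillic
            counts["cyrillic"] += 1
        elif ch.isalpha():
            counts["latin"] += 1
    return max(counts, key=counts.get)

print(guess_script("natural language processing"))  # latin
print(guess_script("自然语言处理"))                  # cjk
```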



  • Website:

Github


  • Installation:

[root@master pynlpl-master]# pip install polyglot
Collecting polyglot
  Downloading polyglot-15.10.03-py2.py3-none-any.whl (54kB)
    100% |████████████████████████████████| 61kB 153kB/s
Collecting pycld2>=0.3 (from polyglot)
  Downloading pycld2-0.31.tar.gz (14.3MB)
    100% |████████████████████████████████| 14.3MB 71kB/s
Collecting wheel>=0.23.0 (from polyglot)
  Downloading wheel-0.29.0-py2.py3-none-any.whl (66kB)
    100% |████████████████████████████████| 71kB 4.2MB/s
Collecting futures>=2.1.6 (from polyglot)
  Downloading futures-3.0.5-py2-none-any.whl
Requirement already satisfied (use --upgrade to upgrade): six>=1.7.3 in /usr/lib/python2.7/site-packages/six-1.10.0-py2.7.egg (from polyglot)
Collecting PyICU>=1.8 (from polyglot)
  Downloading PyICU-1.9.3.tar.gz (179kB)
    100% |████████████████████████████████| 184kB 2.9MB/s
Collecting morfessor>=2.0.2a1 (from polyglot)
  Downloading Morfessor-2.0.2alpha3.tar.gz
Installing collected packages: pycld2, wheel, futures, PyICU, morfessor, polyglot
  Running setup.py install for pycld2 ... done
  Running setup.py install for PyICU ... done
  Running setup.py install for morfessor ... done
Successfully installed PyICU-1.9.3 futures-3.0.5 morfessor-2.0.2a3 polyglot-15.10.3 pycld2-0.31 wheel-0.29.0



MontyLingua


  • Overview:

MontyLingua is a free, robust, end-to-end English text processor. Feed raw English text into MontyLingua and you get back a semantic interpretation of that text. It is well suited to information retrieval and extraction, request processing, and question answering. From English text it can extract subject/verb/object triples; adjective, noun, and verb phrases; names of people, places, and events; dates and times; and other semantic information.
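Extracting subject/verb/object triples, as MontyLingua does, can be caricatured as pattern-matching over part-of-speech tags. The toy sketch below assumes the tagging has already been done and matches only single-word noun-verb-noun windows — MontyLingua's real extractor handles full phrases:

```python
def svo_triples(tagged):
    """Scan (word, tag) pairs for noun-verb-noun patterns -- a toy
    stand-in for MontyLingua's subject/verb/object extraction."""
    triples = []
    for i in range(len(tagged) - 2):
        (w1, t1), (w2, t2), (w3, t3) = tagged[i:i + 3]
        if t1 == "NOUN" and t2 == "VERB" and t3 == "NOUN":
            triples.append((w1, w2, w3))
    return triples

tagged = [("dogs", "NOUN"), ("chase", "VERB"), ("cats", "NOUN"),
          ("and", "CONJ"), ("cats", "NOUN"), ("chase", "VERB"), ("mice", "NOUN")]
print(svo_triples(tagged))   # [('dogs', 'chase', 'cats'), ('cats', 'chase', 'mice')]
```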


  • Website:

HomePage

Github


  • Installation:


Usage
Webservice
python server.py
The webservice runs on port 8001 at /service by default. For parameters etc. see the NIF spec.
Therefore you can curl your query like this:
curl "http://localhost:8001/service?nif=true&input-type=text&input=This%20is%20a%20city%20called%20Berlin."
or simply use your browser to query the target.
Console
python nif.py
But this method is mainly for debugging purposes and supports only hardcoded options.





BLLIP Parser


  • Overview:

BLLIP Parser (also known as the Charniak-Johnson parser) is a statistical natural language parser that integrates a generative constituent parser with maximum-entropy reranking. It provides both a command-line and a Python interface.
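The two-stage design — a generative parser proposes its n-best parses, then a discriminative maximum-entropy reranker rescores them — can be sketched in miniature. The candidate parses, feature names, and weights below are all invented for illustration; BLLIP's real reranker uses millions of learned features:

```python
def rerank(candidates, weights):
    """Pick the candidate whose feature vector scores highest under a
    log-linear (maximum-entropy style) model: score = sum(w[f] * value)."""
    def score(features):
        return sum(weights.get(f, 0.0) * v for f, v in features.items())
    return max(candidates, key=lambda c: score(c["features"]))

# Stage 1 (pretend generative parser): n-best parses with toy feature vectors.
candidates = [
    {"parse": "(S (NP ...) (VP ...))", "features": {"log_p": -2.0, "right_branch": 1.0}},
    {"parse": "(S (VP ...) (NP ...))", "features": {"log_p": -1.5, "right_branch": 0.0}},
]
# Stage 2: reranker weights, which would normally be learned offline.
weights = {"log_p": 1.0, "right_branch": 0.8}

print(rerank(candidates, weights)["parse"])
```

Note how the reranker can overturn the generative ranking: the second parse has the higher generative score (log_p), but the feature weights push the first parse ahead.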


  • Website:

GitHub

HomePage


  • Installation:

[root@master pynlpl-master]# pip install --user bllipparser
Collecting bllipparser
  Downloading bllipparser-2015.12.3.tar.gz (548kB)
    100% |████████████████████████████████| 552kB 1.2MB/s
Requirement already satisfied (use --upgrade to upgrade): six in /usr/lib/python2.7/site-packages/six-1.10.0-py2.7.egg (from bllipparser)
Building wheels for collected packages: bllipparser
  Running setup.py bdist_wheel for bllipparser ... done
  Stored in directory: /root/.cache/pip/wheels/6f/7a/d8/037a4aa0fa275f43e1129008eb7834dc8522ef158d2e96534b
Successfully built bllipparser
Installing collected packages: bllipparser
Successfully installed bllipparser



Quepy


  • Overview:

Quepy is a Python framework for transforming natural language questions into queries in a database query language. It makes it easy to define custom mappings between different kinds of natural language questions and database queries, so with Quepy you can build your own natural-language database query system by changing only a few lines of code.
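Quepy's core idea is pairing question patterns with query templates (in Quepy proper this is a declarative DSL compiled to SPARQL or MQL). Here is a crude regex-based caricature of that mapping — the patterns and the SPARQL-ish template below are made up for illustration and are not Quepy's API:

```python
import re

# Each rule pairs a question pattern with a query template (both invented).
RULES = [
    (re.compile(r"who is (?P<name>.+)\?", re.I),
     'SELECT ?def WHERE {{ ?p rdfs:label "{name}" . ?p rdfs:comment ?def }}'),
    (re.compile(r"what is (?P<thing>.+)\?", re.I),
     'SELECT ?def WHERE {{ ?t rdfs:label "{thing}" . ?t rdfs:comment ?def }}'),
]

def question_to_query(question):
    """Return the query for the first matching rule, or None."""
    for pattern, template in RULES:
        m = pattern.match(question)
        if m:
            return template.format(**m.groupdict())
    return None

print(question_to_query("Who is Alan Turing?"))
```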


  • Website:

GitHub - machinalis/quepy: A python framework to transform natural language questions to queries in a database query language.
Quepy: A Python framework to transform natural language questions to queries.


  • Installation:

[root@master pynlpl-master]# pip install quepy
Collecting quepy
  Downloading quepy-0.2.tar.gz (42kB)
    100% |████████████████████████████████| 51kB 128kB/s
Collecting refo (from quepy)
  Downloading REfO-0.13.tar.gz
Requirement already satisfied (use --upgrade to upgrade): nltk in /usr/lib/python2.7/site-packages (from quepy)
Collecting SPARQLWrapper (from quepy)
  Downloading SPARQLWrapper-1.7.6.zip
Collecting rdflib>=4.0 (from SPARQLWrapper->quepy)
  Downloading rdflib-4.2.1.tar.gz (889kB)
    100% |████████████████████████████████| 890kB 823kB/s
Collecting keepalive>=0.5 (from SPARQLWrapper->quepy)
  Downloading keepalive-0.5.zip
Collecting isodate (from rdflib>=4.0->SPARQLWrapper->quepy)
  Downloading isodate-0.5.4.tar.gz
Requirement already satisfied (use --upgrade to upgrade): pyparsing in /usr/lib/python2.7/site-packages (from rdflib>=4.0->SPARQLWrapper->quepy)
Collecting html5lib (from rdflib>=4.0->SPARQLWrapper->quepy)
  Downloading html5lib-0.9999999.tar.gz (889kB)
    100% |████████████████████████████████| 890kB 854kB/s
Requirement already satisfied (use --upgrade to upgrade): six in /usr/lib/python2.7/site-packages/six-1.10.0-py2.7.egg (from html5lib->rdflib>=4.0->SPARQLWrapper->quepy)
Building wheels for collected packages: quepy, refo, SPARQLWrapper, rdflib, keepalive, isodate, html5lib
  Running setup.py bdist_wheel for quepy ... done
  Stored in directory: /root/.cache/pip/wheels/c8/04/bf/495b88a68aa5c1e9dd1629b09ab70261651cf517d1b1c27464
  Running setup.py bdist_wheel for refo ... done
  Stored in directory: /root/.cache/pip/wheels/76/97/81/825976cf0a2b9ad759bbec13a649264938dffb52dfd56ac6c8
  Running setup.py bdist_wheel for SPARQLWrapper ... done
  Stored in directory: /root/.cache/pip/wheels/50/fe/25/be6e98daa4f576494df2a18d5e86a182e3d7e0735d062cc984
  Running setup.py bdist_wheel for rdflib ... done
  Stored in directory: /root/.cache/pip/wheels/fb/93/10/4f8a3e95937d8db410a490fa235bd95e0e0d41b5f6274b20e5
  Running setup.py bdist_wheel for keepalive ... done
  Stored in directory: /root/.cache/pip/wheels/16/4f/c1/121ddff67b131a371b66d682feefac055fbdbb9569bfde5c51
  Running setup.py bdist_wheel for isodate ... done
  Stored in directory: /root/.cache/pip/wheels/61/c0/d2/6b4a10c222ba9261ab9872a8f05d471652962284e8c677e5e7
  Running setup.py bdist_wheel for html5lib ... done
  Stored in directory: /root/.cache/pip/wheels/6f/85/6c/56b8e1292c6214c4eb73b9dda50f53e8e977bf65989373c962
Successfully built quepy refo SPARQLWrapper rdflib keepalive isodate html5lib
Installing collected packages: refo, isodate, html5lib, rdflib, keepalive, SPARQLWrapper, quepy
Successfully installed SPARQLWrapper-1.7.6 html5lib-0.9999999 isodate-0.5.4 keepalive-0.5 quepy-0.2 rdflib-4.2.1 refo-0.13



MBSP


  • Overview:

MBSP is a text analysis system based on the TiMBL and MBT memory based learning applications developed at CLiPS and ILK. It provides tools for Tokenization and Sentence Splitting, Part of Speech Tagging, Chunking, Lemmatization, Relation Finding and Prepositional Phrase Attachment.
The general English version of MBSP has been trained on data from the Wall Street Journal corpus.




  • Website:

HomePage

Github


  • Installation:

Download, extract, then build and install:

[root@master MBSP]# python setup.py install
..... build output .....
..... (takes about two minutes) .....



References:


  • Li Yan's answer on Zhihu: What open-source NLP projects/toolkits are in common use today?
  • Dataunion (数盟): Eight must-know tools for doing NLP in Python
