Grigori SIDOROV
PhD, Professor and researcher
(Full Professor)
Natural Language and Text Processing Laboratory
Centro de Investigación en Computación (Center for Computer Research,
CIC)
Instituto Politécnico Nacional (National Polytechnic Institute,
IPN),
Mexico City, Mexico
Regular member of the Mexican Academy of Sciences
National Researcher of Mexico (SNI), level 3 (highest)
Editor-in-Chief of the research journal Computación y Sistemas
(in the Conacyt index of excellence)
Phone: +(52)-55-57296000 ext.
56518, 56544
Mobile: +(52-1)-55-91887293
email:
Curriculum Vitae
Text processing techniques and systems, automatic dictionary processing,
automatic morphological analysis of different languages, automatic syntactic
analysis, anaphora resolution, word sense disambiguation, corpus linguistics,
parallel texts, linguistic software development.
Current projects:
Linguistic tools; parallel texts; automatic analysis of explanatory
dictionaries; sentiment analysis.
License agreement:
- You can use all these programs freely for academic purposes. No warranty.
- You should inform us about the usage of the programs.
- You should cite the corresponding papers in your publications
obtained with the help of these programs.
- Downloading means that you accept the license. Thank you.
English-Spanish dictionary of weighted
morphological forms. Forms are weighted according to the distributions of
corresponding grammar classes in corpora. Unicode. Spanish-English version is
available on request. For example:
'cause porque 1.0000000
'til hasta 1.0000000
a un 0.4603677
a una 0.3662918
a unas 0.0734382
a uno 0.0031157
a unos 0.0967866
abaci ábaco 0.0561639
abaci ábacos 0.9438361
abacus ábaco 0.9890721
abacus ábacos 0.0109279
abacuses ábaco 0.0561639
abacuses ábacos 0.9438361
abandon abandonábamos 0.0024804
abandon abandonáis 0.0005694
abandon abandonáramos 0.0004860
abandon abandonáremos 0.0007113
abandon abandonásemos 0.0004860...
...abandon abandonaba 0.0779384
abandon abandonabais 0.0000805
abandon abandonaban 0.0226584...
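The excerpt above suggests one entry per line: an English form, a Spanish form, and a weight, separated by whitespace. A minimal sketch of loading such lines and picking the highest-weighted Spanish form is given below. It assumes exactly three whitespace-separated fields per line, which is inferred from the excerpt rather than from a format specification; the function names `parse_weighted_dictionary` and `best_translation` are hypothetical and not part of the distributed resource.

```python
from collections import defaultdict


def parse_weighted_dictionary(lines):
    """Map each English form to a list of (Spanish form, weight) pairs.

    Assumption: every entry has exactly three whitespace-separated fields.
    """
    entries = defaultdict(list)
    for raw in lines:
        fields = raw.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        english, spanish, weight = fields
        entries[english].append((spanish, float(weight)))
    return dict(entries)


def best_translation(entries, english):
    """Return the Spanish form with the highest weight, or None if unknown."""
    candidates = entries.get(english, [])
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])[0]


# Sample lines copied from the excerpt above.
sample = """\
a un 0.4603677
a una 0.3662918
a unas 0.0734382
a uno 0.0031157
a unos 0.0967866
abacus ábaco 0.9890721
abacus ábacos 0.0109279
""".splitlines()

entries = parse_weighted_dictionary(sample)
print(best_translation(entries, "a"))       # un
print(best_translation(entries, "abacus"))  # ábaco
```

In the excerpt, the weights listed for a single English form (for example, "a") add up to 1, so they can be read as a distribution over the Spanish forms.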
Paper to cite for the English-Spanish dictionary of weighted morphological
forms:
Grigori Sidorov, Alberto Barrón-Cedeño and Paolo Rosso.
English-Spanish Large
Statistical Dictionary of Inflectional Forms. In: Proceedings of the Seventh
International Conference on Language Resources and Evaluation (LREC'10),
Valletta, Malta. European Language Resources Association (ELRA), 2010, pp.
277-281.
Interface for the system for fast search of Maya glyphs
based on their visual structural description (ZIP), or
compressed as an EXE file.
Beta version. The system uses the dictionary of J. Montgomery.
EXE: Download the Glyphs.exe file and execute it; the files will be copied to
the folder you choose. Then execute the file SETUP.EXE.
ZIP: Download the Glyphs.zip file and unzip the files to the folder you choose.
Then execute the file SETUP.EXE.
Papers to cite for the glyph search system:
1. Obdulia Pichardo Lagunas, Grigori Sidorov.
Diccionario de los glifos maya con descripción visual
estructural. In: Proc. of International Conference EURALEX-2008, Barcelona,
Spain, July 2008, pp. 747-751.
2. Grigori Sidorov, Obdulia Pichardo-Lagunas, and Liliana Chanona-Hernandez.
Search Interface to a Mayan Glyph Database based on
Visual Characteristics. Lecture Notes in Computer Science, Vol. 5723,
Springer-Verlag, 2009, pp. 222-229.
System for automatic morphological analysis of Spanish. NEW:
A complete wordlist (beta version) generated with this system is available.
System for automatic morphological analysis of Russian.
These are EXE files for Windows; DLLs are available on request.
These programs perform lemmatization and provide grammatical
information for each word form of Spanish or Russian, respectively.
See the detailed descriptions on the corresponding pages (follow the links).
Paper to cite for the morphological analysis systems:
A. Gelbukh, G. Sidorov.
Approach to construction of
automatic morphological analysis systems for inflective languages with little
effort. In: Computational Linguistics and Intelligent Text Processing (CICLing-2003),
Lecture Notes in Computer Science, N 2588, Springer-Verlag, 2003, pp. 215–220.
Parser with Spanish grammar (EXE and DLL for
Windows).
This is a chart parser that uses a context-free (CF) grammar with elements of
unification. An experimental CF grammar for Spanish is provided, along with
tools for modifying it.
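As a rough illustration of chart-based parsing with a CF grammar, the sketch below implements a CYK recognizer, one simple chart technique. It is not the parser distributed here: it leaves out unification entirely, and the toy grammar, the lexicon, and the name `cyk_recognize` are hypothetical and were invented for this example only.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form: (left, right) -> parents.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("DET", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}

# Hypothetical toy lexicon: word -> possible categories.
LEXICON = {
    "el": {"DET"},
    "la": {"DET"},
    "perro": {"N"},
    "pelota": {"N"},
    "ve": {"V"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence from `start`."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(word, ()))
    # Fill longer spans from shorter ones.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY_RULES.get((left, right), set())
    return start in chart[0][n]


print(cyk_recognize("el perro ve la pelota".split()))  # True
print(cyk_recognize("perro el ve".split()))            # False
```

The chart stores every category found for every span, so each sub-span is analyzed only once and then reused by all longer spans that contain it.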
Paper to cite:
A. Gelbukh, G. Sidorov, S. Galicia Haro, I. Bolshakov.
Environment for Development of a Natural Language Syntactic Analyzer. In: Acta
Academia 2002, Moldova, 2002, pp. 206-213.
More than 170 scientific publications, 1 patent.
More than 300 citations of my works (excluding self-citations), h-index 10.
- “Lomonosov” Moscow National University, 1996
Candidate of Philological Sciences (Ph.D.) (Structural, applied and
mathematical linguistics)
Thesis: “Design and implementation of linguistic models, algorithms, and
data for systems with morphological analysis and generation for the Russian
language”;
- “Lomonosov” Moscow National University, 1983-1988
(M.C. & B.C.) Philological Faculty, Department of Structural and Applied
Linguistics;
- A. Gelbukh, G. Sidorov, A. Guzmán-Arenas. Use of a weighted topic hierarchy
for text retrieval and classification. In Václav Matoušek et al. (Eds.).
Text, Speech and Dialogue. Proc. 2nd International Workshop TSD-99, Plzen,
Czech Republic, September 13-17, 1999. Lecture Notes in Artificial
Intelligence, No. 1692, Springer, pp. 130–135.
- A. Gelbukh, G. Sidorov, and A. Guzmán-Arenas. A
Method of Describing Document Contents through Topic Selection. Proc.
SPIRE’99, International Symposium on String Processing and Information
Retrieval, Cancun, Mexico, September 22–24. IEEE Computer Society Press,
1999, pp. 73-80.
- Alexander F. Gelbukh and Grigori Sidorov.
On Indirect Anaphora
Resolution. Proc. PACLING-99, Pacific Association for Computational
Linguistics, ISBN 0-9685753-0-7, University of Waterloo, Waterloo, Ontario,
Canada, August 25-28, 1999, pp. 181-190.
- Grigori Sidorov, Alexander Gelbukh.
Demonstrative pronouns as markers of indirect anaphora. In: Proc. 2nd
International Conference on Cognitive Science and 16th Annual Meeting of the
Japanese Cognitive Science Society Joint Conference (ICCS/JCSS99), July
27-30, 1999, Tokyo, Japan, pp. 418-423.
- Alexander Gelbukh and Grigori Sidorov.
Approach to construction of
automatic morphological analysis systems for inflective languages with
little effort. Lecture Notes in Computer Science, N 2588,
Springer-Verlag, 2003, pp. 215–220.
- Alexander Gelbukh, Grigori Sidorov, and Liliana Chanona-Hernández.
Compilation of a Spanish representative corpus. Lecture Notes in Computer
Science N 2276, Springer-Verlag, 2002, pp. 285–288.
- Alexander Gelbukh and Grigori Sidorov.
Automatic Selection
of Defining Vocabulary in an Explanatory Dictionary. Proc. CICLing-2002,
Conference on Intelligent Text Processing and Computational Linguistics,
February 16–23, 2002, Mexico City. Lecture Notes in Computer Science N 2276,
Springer-Verlag, pp. 300–303.
- Alexander Gelbukh, Grigori Sidorov, SangYong Han, and Erika
Hernández-Rubio. Automatic Enrichment of Very Large
Dictionary of Word Combinations on the Basis of Dependency Formalism.
Lecture Notes in Artificial Intelligence N 2972, 2004, ISSN 0302-9743,
Springer-Verlag, pp. 430-437 (discussion of the collocation concept).
- Alexander Gelbukh and Grigori Sidorov.
Alignment of Paragraphs in Bilingual Texts using Bilingual Dictionaries and
Dynamic Programming. Lecture Notes in Computer Science, N 4225, ISSN
0302-9743, Springer-Verlag, 2006, pp. 824-833 (methods of alignment of
parallel texts).
- Gaspár Ramírez, James L. Fidelholtz, Héctor Jiménez, Grigori Sidorov.
Elaboración de un diccionario de verbos del español a
partir de una lexicografía sistemática. In: “Avances en la Ciencia de la
computación”, Proc. of 7th Int. Conf. ENC-2006, San Luis Potosí, México, 2006,
pp. 270-275.
- Alexander Gelbukh, Grigori Sidorov, SangYong Han.
On Some Optimization Heuristics
for Lesk-Like WSD Algorithms. Lecture Notes in Computer Science, N 3513,
ISSN 0302-9743, Springer-Verlag, 2005, pp. 402–405.
- Alexander Gelbukh and Grigori Sidorov. Zipf
and Heaps Laws’ Coefficients Depend on Language. Lecture Notes in
Computer Science N 2004, 2001, ISSN 0302-9743, Springer-Verlag, pp. 330–333.
- Castro-Sánchez, N. A., Sidorov, G. Automatic
Acquisition of Synonyms of Verbs from an Explanatory Dictionary using
Hyponym and Hyperonym Relations. Lecture Notes in Computer Science, vol.
6718, 2011.
- María de los Ángeles Alonso-Lavernia, Argelio Víctor De-la-Cruz-Rivera,
and Grigori Sidorov. Generation of Natural
Language Explanations of Rules in an Expert System. Lecture Notes in
Computer Science N 3878, Springer-Verlag, ISSN 0302-9743, 2006, pp. 311-314.
You can find more information about the papers and about our laboratory on
the page of Alexander Gelbukh. More information is also available about the
annual international conference on computational linguistics, CICLing
(Springer, LNCS series), and about the Mexican International Conference on
Artificial Intelligence, MICAI (Springer, LNAI series).