Mirror of https://github.com/gryf/ebook-converter.git, synced 2026-03-26 12:33:32 +01:00
Compare commits: 76e604c951...master (14 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | c89fc132b8 | | |
| | 8b8a92e9fd | | |
| | 6b7f796cfb | | |
| | 72d0858ad8 | | |
| | 4f548ec882 | | |
| | 0faa2c0758 | | |
| | d37850520b | | |
| | 5e56cb8c7a | | |
| | 084e0d11ce | | |
| | 4c3c5a9e27 | | |
| | c240495c3d | | |
| | 53dea56929 | | |
| | ef02332465 | | |
| | 74abaf0de0 | | |
1
.gitignore
vendored
@@ -3,3 +3,4 @@ build/
 dist/
 sdist/
 *.egg-info/
+venv/
@@ -1,2 +0,0 @@
-graft ebook_converter/data
-exclude .gitignore
61
README.rst
@@ -2,24 +2,39 @@
 Ebook converter
 ===============

-This is impudent ripoff of the bits from `Calibre project`_, and is aimed only
-for converter thing.
+This is an impudent ripoff of the bits from `Calibre project`_, and is aimed
+only for converter thing.

-My motivation is to have only converter for ebooks run from commandline,
-without all of those bells and whistles Calibre has, and with cleanest more
-*pythonic* approach.
-
+My motivation is to have only the converter for ebooks run from the
+commandline, without all of those bells and whistles Calibre has, and with
+cleanest more *pythonic* approach.

 Requirements
 ------------

 To build and run ebook converter, you'll need:

-- Python 3.6 or newer
+- Python 3.10 or newer
 - `Liberation fonts`_
 - setuptools
 - ``pdftohtml``, ``pdfinfo`` and ``pdftoppm`` from `poppler`_ project for
   conversion from PDF available in ``$PATH``
+- ``libxml2-dev`` and ``libxslt-dev`` as dependencies for format manipulation
+  from some of the Calibre code
+
+and several Python packages:
+
+- `beautifulsoup4`_
+- `css-parser`_
+- `filelock`_
+- `html2text`_
+- `html5-parser`_
+- `msgpack`_
+- `odfpy`_
+- `pillow`_
+- `python-dateutil`_
+- `setuptools`_
+- `tinycss`_

 No Python2 support. Even if Calibre probably still is able to run on Python2, I
 do not have an intention to support it.
@@ -28,9 +43,9 @@ do not have an intention to support it.
 What's supported
 ----------------

-To be able to perform some optimization and make converter more reliable and
-easy to use, first I need to remove some of the features, which are totally not
-crucial in my opinion, although they might be re-added later, like, for
+To be able to perform some optimization and make the converter more reliable
+and easy to use, first I need to remove some of the features, which are totally
+not crucial in my opinion, although they might be re-added later, like, for
 instance there is no automatic language translations depending on the locale
 settings.

@@ -44,15 +59,16 @@ Windows is not currently supported, because of the original spaghetti code.
 This may change in the future, after cleanup of mentioned pasta would be
 completed.

-So called `Kindle periodical` format is not supported, since all we do care are
-local files. If there would be downloaded periodical thing (using Calibre for
-example), it would be treated as common book.
+So called *Kindle periodical* format (which `Amazon has`_ `killed`_ anyway back
+in September 2023) is not supported, since all we do care are local files. If
+there would be downloaded periodical thing (using Calibre for example), it
+would be treated as common book.


 Input formats
 ~~~~~~~~~~~~~

-Currently, I've tested following input formats:
+Currently, I've tested the following input formats:

 - Microsoft Word 2007 and up (``docx``)
 - EPUB, both v2 and v3 (``epub``)
@@ -107,7 +123,7 @@ managers), i.e:
    $ . venv/bin/activate
    (venv) $ git clone https://github.com/gryf/ebook-converter
    (venv) $ cd ebook-converter
-   (venv) $ pip install -r requirements.txt .
+   (venv) $ pip install .

 Simple as that. And from now on, you can issue converter:

@@ -122,9 +138,20 @@ License
 This work is licensed on GPL3 license, like the original work. See LICENSE file
 for details.


 .. _Calibre project: https://calibre-ebook.com/
 .. _pypi: https://pypi.python.org
 .. _Liberation fonts: https://github.com/liberationfonts/liberation-fonts
-.. _Kindle periodical: https://sellercentral.amazon.com/gp/help/external/help.html?itemID=202047960&language=en-US
+.. _Amazon has: https://goodereader.com/blog/kindle/amazon-will-discontinue-newspaper-and-magazine-subscriptions-in-september
+.. _killed: https://www.theverge.com/23861370/amazon-kindle-periodicals-unlimited-ended
 .. _poppler: https://poppler.freedesktop.org/
+.. _beautifulsoup4: https://www.crummy.com/software/BeautifulSoup
+.. _css-parser: https://github.com/ebook-utils/css-parser
+.. _filelock: https://github.com/tox-dev/py-filelock
+.. _html2text: https://github.com/Alir3z4/html2text
+.. _html5-parser: https://html5-parser.readthedocs.io
+.. _msgpack: https://msgpack.org
+.. _odfpy: https://github.com/eea/odfpy
+.. _pillow: https://python-pillow.github.io
+.. _python-dateutil: https://github.com/dateutil/dateutil
+.. _setuptools: https://setuptools.pypa.io
+.. _tinycss: http://tinycss.readthedocs.io
@@ -32,7 +32,7 @@ def debug():
 # plugins {{{


-class Plugins(collections.Mapping):
+class Plugins(collections.abc.Mapping):

     def __init__(self):
         self._plugins = {}
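The abstract base classes have lived in `collections.abc` since Python 3.3, and the old aliases on the top-level `collections` module were removed in Python 3.10, which is why `collections.Mapping` has to become `collections.abc.Mapping` here. A minimal sketch of a read-only mapping built that way follows; the `Registry` class and its contents are hypothetical and not taken from the repository.

```python
import collections.abc


class Registry(collections.abc.Mapping):
    """Hypothetical read-only mapping; only three abstract methods are required."""

    def __init__(self, items):
        self._items = dict(items)

    def __getitem__(self, key):
        return self._items[key]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


registry = Registry({"speedup": "loaded"})
# Mixin methods such as get(), keys() and __contains__ are supplied by the ABC.
print(registry.get("speedup"), "speedup" in registry, len(registry))
```

The same move applies to `MutableSet` and the other container ABCs.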
@@ -19,7 +19,7 @@ def is_iterable(obj):
     return hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes))


-class OrderedSet(collections.MutableSet):
+class OrderedSet(collections.abc.MutableSet):
     """
     An OrderedSet is a custom MutableSet that remembers its order, so that
     every entry has an index that can be looked up.
@@ -237,7 +237,7 @@ class HTMLInput(InputFormatPlugin):
         if not os.access(link, os.R_OK):
             return link_
         if os.path.isdir(link):
-            self.log.warning(link_, 'is a link to a directory. Ignoring.')
+            self.log.warning('%s is a link to a directory. Ignoring.', link_)
             return link_
         if link not in self.added_resources:
             bhref = os.path.basename(link)
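The change above moves `link_` out of the message position and into the argument list. Assuming `self.log` behaves like a standard-library `logging.Logger` (its definition is not part of this diff), the first positional argument is the format string and the remaining ones are interpolated only when the record is emitted, so the old call would have used `link_` itself as the format. A small sketch with the standard `logging` module; the logger name and the path are made up.

```python
import io
import logging

stream = io.StringIO()
log = logging.getLogger("example.htmlinput")  # hypothetical logger name
log.setLevel(logging.WARNING)
log.propagate = False  # keep the record out of the root logger's handlers
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
log.addHandler(handler)

link = "images/"  # made-up value
# Format string first, arguments after; interpolation is deferred until
# the record is actually handled.
log.warning("%s is a link to a directory. Ignoring.", link)

output = stream.getvalue()
print(output, end="")
```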
@@ -62,7 +62,7 @@ class PMLOutput(OutputFormatPlugin):
                     im = Image.open(io.BytesIO(item.data))
                 else:
                     im = Image.open(io.BytesIO(item.data)).convert('P')
-                    im.thumbnail((300,300), Image.ANTIALIAS)
+                    im.thumbnail((300,300), Image.LANCZOS)

                 data = io.BytesIO()
                 im.save(data, 'PNG')
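`Image.ANTIALIAS` was deprecated in Pillow 9.1 and removed in Pillow 10.0; `Image.LANCZOS` names the same filter, so the substitution above should not change the output. For code that also has to run against other Pillow releases, a lookup along the following lines is one option. This is only a sketch and not what the repository does: `pick_lanczos` is a hypothetical helper, and the stand-in objects merely imitate attributes of `PIL.Image` so that the snippet runs without Pillow installed.

```python
from types import SimpleNamespace


def pick_lanczos(image_module):
    """Return the Lanczos resampling constant from a PIL.Image-like module."""
    resampling = getattr(image_module, "Resampling", None)
    if resampling is not None and hasattr(resampling, "LANCZOS"):
        return resampling.LANCZOS
    if hasattr(image_module, "LANCZOS"):
        return image_module.LANCZOS
    # Old releases only: ANTIALIAS was the former name of this filter.
    return image_module.ANTIALIAS


# Stand-ins with made-up values; with Pillow installed one would pass PIL.Image.
modern = SimpleNamespace(Resampling=SimpleNamespace(LANCZOS=1), LANCZOS=1)
legacy = SimpleNamespace(ANTIALIAS=1)
print(pick_lanczos(modern), pick_lanczos(legacy))
```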
@@ -1012,7 +1012,7 @@ class HTMLConverter(object):
             self.image_memory.append(pt) # Neccessary, trust me ;-)
             try:
                 im.resize((int(width), int(height)),
-                          PILImage.ANTIALIAS).save(pt, encoding)
+                          PILImage.LANCZOS).save(pt, encoding)
                 pt.close()
                 self.scaled_images[path] = pt
                 return pt.name
@@ -1970,7 +1970,7 @@ def process_file(path, options, logger):
                 options.cover = cf.name

                 tim = im.resize((int(0.75 * th), th),
-                                PILImage.ANTIALIAS).convert('RGB')
+                                PILImage.LANCZOS).convert('RGB')
                 tf = PersistentTemporaryFile(prefix=__appname__ + '_',
                                              suffix=".jpg")
                 tf.close()
@@ -145,7 +145,7 @@ class Cell(object):
                 continue
             word = token.split()
             word = word[0] if word else ""
-            width = font.getsize(word)[0]
+            width = font.getbbox(word)[2]
             if width > mwidth:
                 mwidth = width
         return parindent + mwidth + 2
@@ -191,7 +191,7 @@ class Cell(object):
             if (ff, fs) != (ts['fontfacename'], ts['fontsize']):
                 font = get_font(ff, self.pts_to_pixels(fs))
             for word in token.split():
-                width, height = font.getsize(word)
+                _, _, width, height = font.getbbox(word)
                 left, right, top, bottom = add_word(width, height, left, right, top, bottom, ls, ws)
         return right+3+max(parindent, 10), bottom

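`ImageFont.getsize()` was removed in Pillow 10.0 in favour of `getbbox()`, which returns `(left, top, right, bottom)`. The two hunks above keep the `right` and `bottom` coordinates, which appears to match what `getsize()` used to report. The Pillow deprecation notes suggest `right - left` and `bottom - top` instead, and the two readings differ whenever the box does not start at the origin. A sketch of both on a plain tuple; the helper names and the sample box are hypothetical.

```python
def extent_from_bbox(bbox):
    """Width/height measured from the origin, as the diff does (bbox[2], bbox[3])."""
    _, _, right, bottom = bbox
    return right, bottom


def box_size_from_bbox(bbox):
    """Width/height of the box itself (right - left, bottom - top)."""
    left, top, right, bottom = bbox
    return right - left, bottom - top


sample = (1, 3, 41, 15)  # made-up bounding box
print(extent_from_bbox(sample), box_size_from_bbox(sample))
```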
@@ -452,7 +452,7 @@ class MobiMLizer(object):
                 try:
                     item = self.oeb.manifest.hrefs[base.urlnormalize(href)]
                 except:
-                    self.oeb.logger.warning('Failed to find image:', href)
+                    self.oeb.logger.warning('Failed to find image: %s', href)
                 else:
                     try:
                         width, height = identify(item.data)[1:]
@@ -444,8 +444,8 @@ class Indexer(object): # {{{
         if self.is_periodical and self.masthead_offset is None:
             raise ValueError('Periodicals must have a masthead')

-        self.log('Generating MOBI index for a %s', 'periodical' if
+        self.log.info('Generating MOBI index for a %s', 'periodical' if
                  self.is_periodical else 'book')
         self.is_flat_periodical = False
         if self.is_periodical:
             periodical_node = next(iter(oeb.toc))
@@ -14,13 +14,15 @@ from odf.draw import Frame as odFrame, Image as odImage
 from odf.namespaces import TEXTNS as odTEXTNS

 from ebook_converter.utils import directory
+from ebook_converter.ebooks.oeb import parse_utils
 from ebook_converter.ebooks.oeb.base import _css_logger
 from ebook_converter import polyglot
+


 class Extract(ODF2XHTML):

-    def extract_pictures(self, zf):
+    def _extract_pictures(self, zf):
         if not os.path.exists('Pictures'):
             os.makedirs('Pictures')
         for name in zf.namelist():
@@ -30,8 +32,8 @@ class Extract(ODF2XHTML):
                 with open(name, 'wb') as f:
                     f.write(data)

-    def apply_list_starts(self, root, log):
-        if not self.list_starts:
+    def _apply_list_starts(self, root, log):
+        if not hasattr(self, "list_starts") or not self.list_starts:
             return
         list_starts = frozenset(self.list_starts)
         for ol in root.xpath('//*[local-name() = "ol" and @class]'):
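The new guard in `_apply_list_starts` covers the case where `list_starts` was never set on the instance. The same check can be written with a `getattr` default, as in the sketch below. `Extractor` and `pending_starts` are hypothetical stand-ins, not code from the repository.

```python
class Extractor:
    """Hypothetical stand-in; the attribute is set only on some code paths."""

    def pending_starts(self):
        # Equivalent to: not hasattr(self, "list_starts") or not self.list_starts
        if not getattr(self, "list_starts", None):
            return frozenset()
        return frozenset(self.list_starts)


plain = Extractor()
numbered = Extractor()
numbered.list_starts = {"L1": 3}  # made-up class name and start value
print(plain.pending_starts(), numbered.pending_starts())
```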
@@ -46,7 +48,7 @@ class Extract(ODF2XHTML):
         self.filter_css(root, log)
         self.extract_css(root, log)
         self.epubify_markup(root, log)
-        self.apply_list_starts(root, log)
+        self._apply_list_starts(root, log)
         html = etree.tostring(root, encoding='utf-8', xml_declaration=True)
         return html

@@ -84,22 +86,21 @@ class Extract(ODF2XHTML):
                     return rule

     def epubify_markup(self, root, log):
-        from ebook_converter.ebooks.oeb.base import XPath, XHTML
         # Fix empty title tags
-        for t in XPath('//h:title')(root):
+        for t in parse_utils.XPath('//h:title')(root):
             if not t.text:
                 t.text = u' '
         # Fix <p><div> constructs as the asinine epubchecker complains
         # about them
-        pdiv = XPath('//h:p/h:div')
+        pdiv = parse_utils.XPath('//h:p/h:div')
         for div in pdiv(root):
-            div.getparent().tag = XHTML('div')
+            div.getparent().tag = parse_utils.XHTML('div')

         # Remove the position:relative as it causes problems with some epub
         # renderers. Remove display: block on an image inside a div as it is
         # redundant and prevents text-align:center from working in ADE
         # Also ensure that the img is contained in its containing div
-        imgpath = XPath('//h:div/h:img[@style]')
+        imgpath = parse_utils.XPath('//h:div/h:img[@style]')
         for img in imgpath(root):
             div = img.getparent()
             if len(div) == 1:
@@ -119,7 +120,7 @@ class Extract(ODF2XHTML):
         # works in both WebKit and ADE.
         # https://bugs.launchpad.net/bugs/1063207
         # https://bugs.launchpad.net/calibre/+bug/859343
-        imgpath = XPath('descendant::h:div/h:div/h:img')
+        imgpath = parse_utils.XPath('descendant::h:div/h:div/h:img')
         for img in imgpath(root):
             div2 = img.getparent()
             div1 = div2.getparent()
@@ -297,7 +298,7 @@ class Extract(ODF2XHTML):
             with open('index.xhtml', 'wb') as f:
                 f.write(polyglot.as_bytes(html))
             zf = ZipFile(stream, 'r')
-            self.extract_pictures(zf)
+            self._extract_pictures(zf)
             opf = OPFCreator(os.path.abspath(os.getcwd()), mi)
             opf.create_manifest([(os.path.abspath(os.path.join(r, f2)), None)
                                  for r, _, fnames in os.walk(os.getcwd())
28
ebook_converter/ebooks/oeb/transforms/unsmarten.py
Normal file
@@ -0,0 +1,28 @@
+__license__ = 'GPL 3'
+__copyright__ = '2011, John Schember <john@nachtimwald.com>'
+__docformat__ = 'restructuredtext en'
+
+from ebook_converter.ebooks.oeb.base import OEB_DOCS, XPath
+from ebook_converter.ebooks.oeb.parse_utils import barename
+from ebook_converter.utils.unsmarten import unsmarten_text
+
+
+class UnsmartenPunctuation:
+
+    def __init__(self):
+        self.html_tags = XPath('descendant::h:*')
+
+    def unsmarten(self, root):
+        for x in self.html_tags(root):
+            if not barename(x.tag) == 'pre':
+                if getattr(x, 'text', None):
+                    x.text = unsmarten_text(x.text)
+                if getattr(x, 'tail', None) and x.tail:
+                    x.tail = unsmarten_text(x.tail)
+
+    def __call__(self, oeb, context):
+        bx = XPath('//h:body')
+        for x in oeb.manifest.items:
+            if x.media_type in OEB_DOCS:
+                for body in bx(x.data):
+                    self.unsmarten(body)
@@ -4,7 +4,6 @@ import os
 import sys

 from ebook_converter import logging
-from ebook_converter.customize.conversion import OptionRecommendation
 from ebook_converter.ebooks.conversion.plumber import Plumber


@@ -68,6 +67,7 @@ def run(args):

     return 0

+
 def main():
     parser = argparse.ArgumentParser()
     parser.add_argument('from_file', help="Input file to be converted")
@@ -83,5 +83,4 @@ def main():

     LOG.set_verbose(args.verbose, args.quiet)

-    print(args)
     sys.exit(run(args))
40
ebook_converter/utils/unsmarten.py
Normal file
@@ -0,0 +1,40 @@
+__license__ = 'GPL 3'
+__copyright__ = '2011, John Schember <john@nachtimwald.com>'
+__docformat__ = 'restructuredtext en'
+
+from ebook_converter.utils.mreplace import MReplace
+
+_mreplace = MReplace({
+    '&#8211;': '--',
+    '&ndash;': '--',
+    '–': '--',
+    '&#8212;': '---',
+    '&mdash;': '---',
+    '—': '---',
+    '&#8230;': '...',
+    '&hellip;': '...',
+    '…': '...',
+    '&#8220;': '"',
+    '&#8221;': '"',
+    '&#8222;': '"',
+    '&#8243;': '"',
+    '&ldquo;': '"',
+    '&rdquo;': '"',
+    '&bdquo;': '"',
+    '&Prime;': '"',
+    '“':'"',
+    '”':'"',
+    '„':'"',
+    '″':'"',
+    '&#8216;':"'",
+    '&#8217;':"'",
+    '&#8242;':"'",
+    '&lsquo;':"'",
+    '&rsquo;':"'",
+    '&prime;':"'",
+    '‘':"'",
+    '’':"'",
+    '′':"'",
+})
+
+unsmarten_text = _mreplace.mreplace
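`MReplace` itself is not part of this diff. As a rough illustration of what a multi-pattern replacer of this kind does, the sketch below substitutes a few typographic characters in a single pass with the standard `re` module. `make_replacer` and the reduced table are hypothetical and are not the repository's implementation.

```python
import re


def make_replacer(table):
    """Build a function that replaces every key of `table` in a single pass."""
    # Longest keys first, so a longer key wins over one that is its prefix.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: table[m.group(0)], text)


unsmarten = make_replacer({
    "\u2013": "--",    # en dash
    "\u2014": "---",   # em dash
    "\u2026": "...",   # horizontal ellipsis
    "\u201c": '"',     # left double quotation mark
    "\u201d": '"',     # right double quotation mark
    "\u2019": "'",     # right single quotation mark
    "&mdash;": "---",  # entity form, in case the text still contains it
})

result = unsmarten("\u201cWait\u2026\u201d she said \u2014 it\u2019s late&mdash;go.")
print(result)
```

A single compiled alternation avoids rescanning the text once per key, which matters when the table is applied to every text node of a book.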
52
pyproject.toml
Normal file
@@ -0,0 +1,52 @@
+[build-system]
+requires = ["setuptools >= 77.0"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "ebook-converter"
+version = "4.12.0"
+requires-python = ">= 3.10"
+description = "Convert ebook between different formats"
+dependencies = [
+    "beautifulsoup4>=4.9.3",
+    "css-parser>=1.0.6",
+    "filelock>=3.0.12",
+    "html2text>=2020.1.16",
+    "html5-parser==0.4.12",
+    "msgpack>=1.0.0",
+    "odfpy>=1.4.1",
+    "pillow>=8.0.1",
+    "python-dateutil>=2.8.1",
+    "setuptools>=61.0",
+    "tinycss>=0.4"
+]
+readme = "README.rst"
+authors = [
+    {name = "gryf", email = "gryf73@gmail.com"}
+]
+license = "GPL-3.0-or-later"
+classifiers = [
+    "Environment :: Console",
+    "Intended Audience :: Other Audience",
+    "Operating System :: POSIX :: Linux",
+    "Development Status :: 3 - Alpha",
+    "Programming Language :: Python",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3 :: Only",
+    "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "Programming Language :: Python :: 3.13"
+]
+
+[project.urls]
+Repository = "https://github.com/gryf/ebook-converter"
+
+[project.scripts]
+ebook-converter = "ebook_converter.main:main"
+
+[tool.setuptools.packages.find]
+exclude = ["snap"]
+
+[tool.setuptools.package-data]
+"*" = ["*.types", "*.css", "*.html", "*.xhtml", "*.xsl", "*.json"]
@@ -1,11 +0,0 @@
-beautifulsoup4>=4.9.3
-css-parser>=1.0.6
-filelock>=3.0.12
-html2text>=2020.1.16
-html5-parser==0.4.9 --no-binary lxml
-msgpack>=1.0.0
-odfpy>=1.4.1
-pillow>=8.0.1
-python-dateutil>=2.8.1
-setuptools>=50.3.2
-tinycss>=0.4
46
setup.cfg
@@ -1,46 +0,0 @@
-[metadata]
-name = ebook-converter
-version = 4.12.0
-summary = Convert ebook between different formats
-description-file =
-    README.rst
-author = gryf
-author-email = gryf73@gmail.com
-license = GPL3
-license_file = LICENSE
-url = https://github.com/gryf/ebook-converter
-classifier =
-    Environment :: Console
-    Intended Audience :: Other Audience
-    License :: OSI Approved :: GNU General Public License v3 (GPLv3)
-    Operating System :: POSIX :: Linux
-    Development Status :: 3 - Alpha
-    Programming Language :: Python
-    Programming Language :: Python :: 3
-    Programming Language :: Python :: 3 :: Only
-    Programming Language :: Python :: 3.6
-    Programming Language :: Python :: 3.7
-
-[options]
-packages = find:
-include_package_data = True
-install_requires =
-    filelock
-    python-dateutil
-    lxml
-    css-parser
-    beautifulsoup4
-    tinycss
-    pillow
-    msgpack
-    html5-parser
-    odfpy
-    setuptools
-    html2text
-
-[options.entry_points]
-console_scripts =
-    ebook-converter=ebook_converter.main:main
-
-[options.package_data]
-* = *.types *.css, *.html, *.xsl