API documentation
This documentation is based on the source code of version 4.0.2 of the chat-archive package. The following modules are available:
chat_archive
chat_archive.backends
chat_archive.backends.gtalk
chat_archive.backends.hangouts
chat_archive.backends.slack
chat_archive.backends.telegram
chat_archive.cli
chat_archive.database
chat_archive.emoji
chat_archive.html
chat_archive.html.keywords
chat_archive.html.redirects
chat_archive.models
chat_archive.profiling
chat_archive.utils
chat_archive
Python API for the chat-archive program.
-
chat_archive.
DEFAULT_ACCOUNT_NAME
= 'default'¶ The name of the default account (a string).
-
class
chat_archive.
ChatArchive
(*args, **kw)[source]¶ Python API for the chat-archive program.
You can set the values of the data_directory, database_file and force properties by passing keyword arguments to the class initializer.
Here’s an overview of the ChatArchive class:
-
alembic_directory
¶ The pathname of the directory containing Alembic migration scripts (a string).
The value of this property is computed at runtime based on the value of __file__ inside of the chat_archive/__init__.py module.
-
backends
[source]¶ A dictionary of available backends (names and dotted paths).
>>> from chat_archive import ChatArchive
>>> archive = ChatArchive()
>>> print(archive.backends)
{'gtalk': 'chat_archive.backends.gtalk', 'hangouts': 'chat_archive.backends.hangouts', 'slack': 'chat_archive.backends.slack', 'telegram': 'chat_archive.backends.telegram'}
Note
The backends property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
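The compute-once behaviour described in this note can be illustrated with the standard library alone. The sketch below uses functools.cached_property as a stand-in for lazy_property (an assumption made for illustration, not the property type that chat-archive itself uses), and the Archive class and its lookups counter are hypothetical.

```python
from functools import cached_property


class Archive:
    """Hypothetical stand-in for a class with a lazily computed property."""

    def __init__(self):
        self.lookups = 0  # counts how often the computation actually runs

    @cached_property
    def backends(self):
        # Pretend this is an expensive discovery step.
        self.lookups += 1
        return {"gtalk": "chat_archive.backends.gtalk"}


archive = Archive()
first = archive.backends   # computed on first access
second = archive.backends  # served from the cache
```

After both accesses the computation has run only once and both names refer to the same dictionary object.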
-
config
[source]¶ A dictionary with general user defined configuration options.
Note
The config property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
config_loader
[source]¶ A ConfigLoader object that provides access to the configuration.
Configuration files are text files in the subset of ini syntax supported by Python’s configparser module. They can be located in the following places:
Directory   Main configuration file      Modular configuration files
/etc        /etc/chat-archive.ini        /etc/chat-archive.d/*.ini
~           ~/.chat-archive.ini          ~/.chat-archive.d/*.ini
~/.config   ~/.config/chat-archive.ini   ~/.config/chat-archive.d/*.ini
The available configuration files are loaded in the order given above, so that user specific configuration files override system wide configuration files.
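The override order can be reproduced with Python’s configparser module directly. The following is a minimal sketch, not the ConfigLoader implementation: the load_in_order() helper, the temporary file names and the option values are all hypothetical.

```python
import configparser
import os
import tempfile


def load_in_order(filenames):
    """Read ini files in the given order; later files override earlier ones."""
    parser = configparser.RawConfigParser()
    # read() silently skips files that don't exist.
    parser.read(filenames)
    return parser


with tempfile.TemporaryDirectory() as directory:
    system_file = os.path.join(directory, "system.ini")
    user_file = os.path.join(directory, "user.ini")
    with open(system_file, "w") as handle:
        handle.write("[chat-archive]\noperator-name = System Default\n")
    with open(user_file, "w") as handle:
        handle.write("[chat-archive]\noperator-name = Jane Doe\n")
    config = load_in_order([system_file, user_file])
    operator_name = config.get("chat-archive", "operator-name")
```

Because the user file is read last, its operator-name value is the one that remains.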
Note
The config_loader property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
declarative_base
¶ The base class for declarative models defined using SQLAlchemy.
-
data_directory
[source]¶ The pathname of the directory where data files are stored (a string).
The environment variable $CHAT_ARCHIVE_DIRECTORY can be used to set the value of this property. When the environment variable isn’t set the default value ~/.local/share/chat-archive is used (where ~ is expanded to the profile directory of the current user).
Note
The data_directory property is a custom_property. You can change the value of this property using normal attribute assignment syntax. This property’s value is computed once (the first time it is accessed) and the result is cached. To clear the cached value you can use del or delattr().
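The fallback from environment variable to default can be sketched as follows. This is an illustration under stated assumptions rather than the package’s own code: default_data_directory() is a hypothetical helper and the /srv/chats path is made up.

```python
import os


def default_data_directory(environ=os.environ):
    """Resolve the data directory from the environment or fall back to the default."""
    configured = environ.get("CHAT_ARCHIVE_DIRECTORY")
    if configured:
        return configured
    # No override given: expand ~ to the current user's directory.
    return os.path.expanduser("~/.local/share/chat-archive")


from_env = default_data_directory({"CHAT_ARCHIVE_DIRECTORY": "/srv/chats"})
fallback = default_data_directory({})
```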
-
database_file
[source]¶ The absolute pathname of the SQLite database file (a string).
This defaults to ~/.local/share/chat-archive/database.sqlite3 (with ~ expanded to the home directory of the current user) based on data_directory.
Note
The database_file property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
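The assign-then-reset behaviour described in this note can be approximated with a plain Python property. The sketch below is not how mutable_property is implemented; the Archive class and the paths are hypothetical and only show the observable behaviour.

```python
import os


class Archive:
    """Hypothetical class with an assignable property that has a computed default."""

    data_directory = "/srv/chats"

    @property
    def database_file(self):
        # Use the override when one was assigned, else compute the default.
        try:
            return self._database_file
        except AttributeError:
            return os.path.join(self.data_directory, "database.sqlite3")

    @database_file.setter
    def database_file(self, value):
        self._database_file = value

    @database_file.deleter
    def database_file(self):
        # Forget the override so the computed default applies again.
        self.__dict__.pop("_database_file", None)


archive = Archive()
computed = archive.database_file
archive.database_file = "/tmp/other.sqlite3"
overridden = archive.database_file
del archive.database_file
restored = archive.database_file
```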
-
force
[source]¶ Retry synchronization of conversations where errors were previously encountered (a boolean, defaults to False).
Note
The force property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
-
import_stats
[source]¶ Statistics about objects imported by backends (a BackendStats object).
Note
The import_stats property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
num_contacts
¶ The total number of chat contacts in the local archive (a number).
-
num_conversations
¶ The total number of chat conversations in the local archive (a number).
-
num_html_messages
¶ The total number of chat messages with HTML formatting in the local archive (a number).
-
num_messages
¶ The total number of chat messages in the local archive (a number).
-
operator_name
[source]¶ The full name of the person using the chat-archive program (a string or None).
The value of operator_name is used to address the operator of the chat-archive program in first person instead of third person. You can change the value in the configuration file:
[chat-archive]
operator-name = ...
The default value in case none has been specified in the configuration file is taken from /etc/passwd using get_full_name().
Note
The operator_name property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
get_accounts_for_backend
(backend_name)[source]¶ Select the configured and/or previously synchronized account names for the given backend.
-
get_accounts_from_database
(backend_name)[source]¶ Get the names of the accounts that are already in the database for the given backend.
-
get_accounts_from_config
(backend_name)[source]¶ Get the names of the accounts configured for the given backend in the configuration file.
-
initialize_backend
(backend_name, account_name)[source]¶ Load a chat archive backend module.
Parameters:
- backend_name – The name of the backend (one of the strings ‘gtalk’, ‘hangouts’, ‘slack’ or ‘telegram’).
- account_name – The name of the account (a string).
Returns: A ChatArchiveBackend object.
Raises: Exception when the backend doesn’t define a subclass of ChatArchiveBackend.
-
is_operator
(contact)[source]¶ Check whether the full name of the given contact matches operator_name.
-
load_backend_module
(backend_name)[source]¶ Load a chat archive backend module.
Parameters: backend_name – The name of the backend (one of the strings ‘gtalk’, ‘hangouts’, ‘slack’ or ‘telegram’).
Returns: The loaded module.
-
parse_account_expression
(value)[source]¶ Parse a backend:account expression.
Parameters: value – The backend:account expression (a string).
Returns: A tuple with two values:
- The name of a backend (a string).
- The name of an account (a string, possibly empty).
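The documented return value suggests a simple split on the first colon. The helper below is a hypothetical re-implementation for illustration (the name split_account_expression() is not part of the package), assuming that an expression without a colon yields an empty account name.

```python
def split_account_expression(value):
    """Split 'backend:account' into (backend, account); the account may be empty."""
    backend_name, _, account_name = value.partition(":")
    return backend_name, account_name


with_account = split_account_expression("slack:work")
without_account = split_account_expression("telegram")
```

Using str.partition() keeps the result a two-item tuple in both cases, so callers never need to check how many parts were found.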
-
search_messages
(keywords)[source]¶ Search the chat messages in the local archive for the given keyword(s).
-
synchronize
(*backends)[source]¶ Download new chat messages.
Parameters: backends – Any positional arguments limit the synchronization to backends whose name matches one of the strings provided as positional arguments. If the name of a backend contains a colon the name is split into two:
- The backend name.
- An account name.
This way one backend can synchronize multiple named accounts into the same local database without causing confusion during synchronization about which conversations, contacts and messages belong to which account.
-
-
class
chat_archive.
BackendStats
[source]¶ Statistics about chat message synchronization backends.
-
__init__
()[source]¶ Initialize a BackendStats object.
-
scope
¶ The current scope (a collections.defaultdict object).
-
chat_archive.backends
Namespace for chat archive backends.
The following chat archive backends have been implemented so far:
- Google Hangouts: chat_archive.backends.hangouts
- Google Talk: chat_archive.backends.gtalk
- Slack: chat_archive.backends.slack
- Telegram: chat_archive.backends.telegram
-
class
chat_archive.backends.
ChatArchiveBackend
(**kw)[source]¶ Abstract base class for chat-archive backends.
When you initialize a ChatArchiveBackend object you are required to provide values for the account_name, archive, backend_name and stats properties. You can set the values of these properties by passing keyword arguments to the class initializer.
Here’s an overview of the ChatArchiveBackend class:
-
account
[source]¶ The Account object corresponding to account_name and backend_name.
Note
The account property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
account_name
[source]¶ The name of the chat account that is being synchronized (a string).
The value of account_name needs to be set by the caller and is used to “get or create” the account object on demand.
Note
The account_name property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named account_name (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
archive
[source]¶ The ChatArchive that is using this backend.
Note
The archive property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named archive (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
backend_name
[source]¶ The name of the chat archive backend (a short alphanumeric string).
The value of backend_name is used to “get or create” the account object on demand.
Note
The backend_name property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named backend_name (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
config
[source]¶ The configuration options for this backend and account (a dictionary).
Note
The config property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
external_id_cache
[source]¶ A dictionary mapping external IDs to Contact objects.
Note
The external_id_cache property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
redirect_stripper
[source]¶ A RedirectStripper object.
Note
The redirect_stripper property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
session
[source]¶ Shortcut for the session property of archive.
Note
The session property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
stats
[source]¶ A BackendStats object.
Note
The stats property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named stats (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
find_contact_by_attributes
(attributes)[source]¶ Find a contact based on their external ID, an email address or a telephone number.
Parameters: attributes – A dictionary with any of the following keys:
- external_id (string value)
- email_addresses (list of strings)
- telephone_numbers (list of strings)
Returns: A Contact object or None.
-
find_contact_by_email_address
(value)[source]¶ Find a contact based on their email address.
Parameters: value – An email address (a string).
Returns: A Contact object or None.
-
find_contact_by_external_id
(external_id)[source]¶ Find a contact based on their ‘external ID’.
Parameters: external_id – The external ID (a string).
Returns: A Contact object or None.
This method uses external_id_cache to speed up lookup of contacts by their external ID.
-
find_contact_by_telephone_number
(value)[source]¶ Find a contact based on their telephone number.
Parameters: value – A telephone number (a string).
Returns: A Contact object or None.
-
get_or_create_contact
(**attributes)[source]¶ Get or create a contact object.
Parameters: attributes – The names and values of model attributes, used to find existing contacts and create new ones.
Returns: A Contact object.
This method serves three distinct purposes:
- Finding existing contacts by their ‘external ID’ or one of their email addresses or telephone numbers.
- Creating new contacts (based on the given attributes).
- Updating existing contacts (based on the given attributes).
Here’s an overview of supported attributes:
- The external_id attribute (whose value is expected to be a string).
- The full_name attribute (whose value is expected to be a string) is split into separate first_name and last_name attributes.
- The attributes email_address and telephone_number (whose values are expected to be strings) are converted to their plural forms email_addresses and telephone_numbers (lists of strings).
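The attribute handling listed above can be sketched as a small normalization step. The helper below is hypothetical and not the method’s actual code; in particular, splitting full_name on the first space is an assumption, since the text doesn’t say how the split is done.

```python
def normalize_contact_attributes(**attributes):
    """Hypothetical helper mirroring the attribute handling described above."""
    normalized = dict(attributes)
    full_name = normalized.pop("full_name", None)
    if full_name:
        # Assumption: everything after the first space is the last name.
        first_name, _, last_name = full_name.strip().partition(" ")
        normalized["first_name"] = first_name
        normalized["last_name"] = last_name.strip()
    for singular, plural in (
        ("email_address", "email_addresses"),
        ("telephone_number", "telephone_numbers"),
    ):
        value = normalized.pop(singular, None)
        if value:
            # Convert the singular value to a one-item list (or extend one).
            normalized[plural] = list(normalized.get(plural, [])) + [value]
    return normalized


result = normalize_contact_attributes(
    external_id="42", full_name="Jane van Doe", email_address="jane@example.com"
)
```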
-
get_or_create_conversation
(external_id, **attributes)[source]¶ Get or create a Conversation object.
Parameters:
- external_id – The external ID of the conversation (a string).
- attributes – Any optional attributes to set when creating a new conversation.
Returns: Refer to get_or_create_object().
-
get_or_create_message
(conversation, **attributes)[source]¶ Get or create a Message object.
Parameters:
- conversation – The Conversation in which the message originated.
- attributes – Any optional attributes to set when creating a new message.
Returns: Refer to get_or_create_object().
-
get_or_create_email_address
(email_address)[source]¶ Get or create an EmailAddress object.
Parameters: email_address – The email address (a string).
Returns: An EmailAddress object.
-
get_or_create_object
(model, required, optional=None)[source]¶ Find an existing object in the local database or create a new object.
Parameters:
- model – The model to query.
- required – A dictionary with the key/value pairs that should be used to search for an existing object.
- optional – Any optional attributes to set when creating a new object.
Returns: A tuple with two values:
-
get_or_create_telephone_number
(telephone_number)[source]¶ Get or create a TelephoneNumber object.
Parameters: telephone_number – The telephone number (a string containing a number).
Returns: A TelephoneNumber object.
-
have_message
(conversation, external_id)[source]¶ Check if a message exists in the local database.
Parameters:
- conversation – The Conversation that contains the message.
- external_id – The unique id of the message (a string).
Returns:
-
pre_process_text
(attributes)[source]¶ Pre-process the text and HTML of a chat message.
Parameters: attributes – A dictionary with Message attributes.
This method works as follows:
- The text is pre-processed using strip_redirects().
- The html is pre-processed using RedirectStripper.
- When the resulting HTML exactly equals the plain text chat message, the html key in attributes is removed.
-
chat_archive.backends.gtalk
Synchronization logic for the Google Talk backend of the chat-archive program.
The Google Talk backend uses the IMAP protocol to discover and download the messages available in the chats_folder of your Google Mail account. The following requirements need to be met in order to use this backend:
- You need to enable IMAP access to your Google Mail account.
- You may need to specifically enable IMAP access to the chats_folder (this turned out to be necessary for me).
Before developing this module in June 2018 I had never implemented any IMAP automation [1] so I wasn’t that familiar with the protocol and I didn’t know about message UIDs. The Unique ID in IMAP protocol blog post provided me with some useful details about the semantics of message UIDs.
This backend assumes and requires that the Google Mail servers provide message UIDs that are stable across sessions (this enables discovery of new messages). My testing implies that this is the case, because it seems to work fine! :-)
[1] This was despite operating my own IMAP server for the past ten years, so I was already familiar with IMAP from the perspective of a user as well as a server administrator.
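Stable UIDs are what make incremental discovery possible: a client can remember the highest UID it has imported and only ask for later ones. The sketch below shows that idea in isolation. Both helper names are hypothetical, no IMAP connection is made, and this is not a description of how the backend itself tracks UIDs.

```python
def uid_search_criterion(highest_known_uid):
    """Build an IMAP UID range that starts just after the last stored message."""
    return "%d:*" % (highest_known_uid + 1)


def select_new_uids(server_uids, highest_known_uid):
    """Keep only UIDs that are newer than anything already imported."""
    # A range like 'N:*' can still return the last message in the folder
    # even when its UID is below N, so filter the response explicitly.
    return sorted(uid for uid in server_uids if uid > highest_known_uid)


criterion = uid_search_criterion(41)
new_uids = select_new_uids([45, 42, 40], 41)
nothing_new = select_new_uids([41], 41)
```

The explicit filter matters because of the range semantics noted in the comment: without it the most recent message could be imported a second time when nothing new has arrived.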
-
chat_archive.backends.gtalk.
FRIENDLY_NAME
= 'Google Talk'¶ A user friendly name for the chat service supported by this backend (a string).
-
chat_archive.backends.gtalk.
NAMESPACED_TAG_PATTERN
= re.compile('^{[^}]+}(\\S+)$')¶ Compiled regular expression to match XML tag names with a name space.
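A short illustration of what this pattern matches: ElementTree reports namespaced tags in the form {namespace}name, and the capture group isolates the local name. The strip_namespace() helper and the example tag are hypothetical; only the pattern itself comes from the module.

```python
import re

# Same expression as documented above.
NAMESPACED_TAG_PATTERN = re.compile("^{[^}]+}(\\S+)$")


def strip_namespace(tag):
    """Return the local tag name, dropping a leading '{namespace}' if present."""
    match = NAMESPACED_TAG_PATTERN.match(tag)
    return match.group(1) if match else tag


local_name = strip_namespace("{jabber:client}message")
unchanged = strip_namespace("body")
```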
-
chat_archive.backends.gtalk.
BOGUS_EMAIL_PATTERN
= re.compile('^private-chat(-[0-9a-f]+)+@groupchat.google.com$', re.IGNORECASE)¶ Compiled regular expression to recognize private messages in group conversations.
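The following sketch shows the pattern applied to a few addresses. The addresses are made up for illustration and the is_private_group_chat_address() helper is hypothetical; only the regular expression is taken from the module.

```python
import re

# Same expression and flag as documented above.
BOGUS_EMAIL_PATTERN = re.compile(
    "^private-chat(-[0-9a-f]+)+@groupchat.google.com$", re.IGNORECASE
)


def is_private_group_chat_address(address):
    """Check whether an address has the private-chat form matched by the pattern."""
    return BOGUS_EMAIL_PATTERN.match(address) is not None


lower_case = is_private_group_chat_address("private-chat-0a1b2c3d-4e5f@groupchat.google.com")
mixed_case = is_private_group_chat_address("Private-Chat-ABCDEF@GroupChat.Google.com")
regular = is_private_group_chat_address("jane@example.com")
```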
-
class
chat_archive.backends.gtalk.
GoogleTalkBackend
(**kw)[source]¶ The Google Talk backend for the chat-archive program.
This backend supports the following configuration options:
Option          Description
chats-folder    See chats_folder.
imap-server     See imap_server.
email           The email address used to sign in to your Google Mail account.
password-name   The name of a password in ~/.password-store to use.
password        See password.
If you set password-name then password doesn’t have to be set. If neither password nor password-name has been set then you will be prompted for your password every time you synchronize.
You can set the values of the chats_folder and imap_server properties by passing keyword arguments to the class initializer.
Here’s an overview of the GoogleTalkBackend class:
-
chats_folder
[source]¶ The folder that contains chat message archives (a string, defaults to ‘[Gmail]/Chats’).
Note
The chats_folder property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
-
client
[source]¶ An IMAP client connection to imap_server.
Note
The client property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
conversation_map
[source]¶ A mapping of conversations.
Note
The conversation_map property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
imap_server
[source]¶ The domain name of the Google Mail IMAP server (a string, defaults to ‘imap.gmail.com’).
Note
The imap_server property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
-
password
[source]¶ The password used to sign in to the Google Mail account (a string).
Note
The password property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
synchronize
()[source]¶ Download RFC822 encoded Google Talk conversations using IMAP and import the embedded chat messages.
-
parse_singlepart_email
(email)[source]¶ Extract a chat message from a single-part email downloaded from chats_folder.
-
parse_multipart_email
(email)[source]¶ Find the text/xml payload in an RFC 822 multi-part email message.
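Locating a text/xml part can be done with the standard library’s email package. The sketch below is an illustration only: find_xml_payload() is a hypothetical helper, the XML snippet is invented, and the real method may select or decode the payload differently.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def find_xml_payload(message):
    """Return the decoded text/xml part of a parsed email, or None."""
    for part in message.walk():
        if part.get_content_type() == "text/xml":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset)
    return None


# Build a small multi-part message to search through.
email_message = MIMEMultipart()
email_message.attach(MIMEText("<con:conversation xmlns:con='urn:example'/>", "xml"))
email_message.attach(MIMEText("<p>Hello</p>", "html"))
xml_payload = find_xml_payload(email_message)
```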
-
find_conversation
(*participants)[source]¶ Find a conversation (without an external ID) that involves the given participants.
-
extract_timestamp
(message_node)[source]¶ Extract a timestamp from a <message> node.
Parameters: message_node – A <message> node.
Returns: A datetime.datetime object.
-
extract_html
(message_node)[source]¶ Try to extract HTML from a <message> node.
Parameters: message_node – A <message> node.
Returns: The extracted HTML (a string) or None.
-
contact_from_jid
(value)[source]¶ Convert a Jabber ID to an email address and use that to find or create a contact.
-
-
class
chat_archive.backends.gtalk.
EmailMessageParser
(**kw)[source]¶ Lazy evaluation of email.message_from_string().
When you initialize an EmailMessageParser object you are required to provide values for the raw_body and uid properties. You can set the values of these properties by passing keyword arguments to the class initializer.
Here’s an overview of the EmailMessageParser class:
Superclass: PropertyManager
Properties: parsed_body, raw_body, timestamp and uid
-
parsed_body
[source]¶ The result of email.message_from_string().
Note
The parsed_body property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
raw_body
[source]¶ The raw message body of the email (a string).
Note
The raw_body property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named raw_body (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
timestamp
[source]¶ Convert the Date: header of the email message to a datetime object.
Note
The timestamp property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
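One way to turn a Date: header into a datetime object is email.utils.parsedate_to_datetime() from the standard library. Whether the class uses this exact function is an assumption; the raw message below is invented for the example.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A minimal raw message with only the headers needed for the example.
raw_body = "Date: Mon, 18 Jun 2018 12:34:56 +0200\nSubject: Chat\n\nHello\n"
parsed_body = message_from_string(raw_body)

# The result is timezone aware because the header carries a UTC offset.
timestamp = parsedate_to_datetime(parsed_body["Date"])
```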
-
uid
[source]¶ The UID of the email message.
Note
The uid property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named uid (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
-
class
chat_archive.backends.gtalk.
LazyXMLFormatter
(node)[source]¶ Lazy evaluation of xml.etree.ElementTree.tostring().
-
__init__
(node)[source]¶ Initialize a LazyXMLFormatter object.
Parameters: node – The XML node to render.
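Lazy rendering of this kind is typically useful in logging calls, where the XML should only be serialized if the message is actually emitted. The class below is a hypothetical sketch of the idea (it is not the LazyXMLFormatter source) and adds a render_count attribute purely to make the laziness visible.

```python
import logging
import xml.etree.ElementTree as ET


class LazyFormatter:
    """Hypothetical wrapper that renders an XML node only when formatted."""

    def __init__(self, node):
        self.node = node
        self.render_count = 0  # only here to demonstrate the laziness

    def __str__(self):
        self.render_count += 1
        return ET.tostring(self.node, encoding="unicode")


logger = logging.getLogger("lazy-xml-example")
logger.setLevel(logging.INFO)
node = ET.fromstring("<message>hi</message>")
formatter = LazyFormatter(node)

# DEBUG is below the logger's level, so the node is never rendered here.
logger.debug("Parsing node: %s", formatter)
renders_after_debug = formatter.render_count

rendered = str(formatter)
```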
-
chat_archive.backends.hangouts
Synchronization logic for the Google Hangouts backend of the chat-archive program.
-
chat_archive.backends.hangouts.
FRIENDLY_NAME
= 'Google Hangouts'¶ A user friendly name for the chat service supported by this backend (a string).
-
class
chat_archive.backends.hangouts.
HangoutsBackend
(**kw)[source]¶ The Google Hangouts backend for the chat-archive program.
This backend supports the following configuration options:
Option          Description
email-address   The email address used to sign in to your Google account.
password-name   The name of a password in ~/.password-store to use.
password        The password used to sign in to your Google account.
If you set password-name then password doesn’t have to be set. If neither password nor password-name has been set then you will be prompted for your password every time you synchronize.
You can set the values of the cookie_file and retry_count properties by passing keyword arguments to the class initializer.
Here’s an overview of the HangoutsBackend class:
Superclass: ChatArchiveBackend
Public methods: connect_then_sync(), download_all_contacts(), download_all_conversations(), download_all_messages(), download_conversation(), download_message_batch(), get_message_html(), handle_import_errors(), is_bogus_user(), perform_initial_sync() and synchronize()
Properties: bogus_user_ids, client, cookie_file and retry_count
-
bogus_user_ids
[source]¶ A set of strings with ‘gaia_id’ values of “bogus” users.
Note
The bogus_user_ids property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
cookie_file
¶ The pathname of the *.json file with cached credentials (a string).
Note
The cookie_file property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
-
client
[source]¶ The hangups client object.
Note
The client property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
retry_count
[source]¶ The number of times that a batch of messages will be requested (a number, defaults to 5).
Note
The retry_count property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
-
download_all_messages
(conversation, conversation_in_db, event_id=None)[source]¶ Download the messages in a specific Hangouts conversation.
-
download_message_batch
(conversation, event_id)[source]¶ Try to download a batch of messages (retrying according to retry_count).
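A bounded retry loop of the kind this method describes could look like the sketch below. Everything here is hypothetical (the helper names, the simulated failures and the choice to raise RuntimeError after the last attempt); it only illustrates requesting a batch at most retry_count times.

```python
def download_with_retries(download, retry_count=5):
    """Call download() up to retry_count times, returning the first result."""
    last_error = None
    for _ in range(retry_count):
        try:
            return download()
        except Exception as error:  # a real backend would catch narrower errors
            last_error = error
    raise RuntimeError("Giving up after %d attempts" % retry_count) from last_error


attempts = []


def flaky_download():
    """Simulated download that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("temporary failure")
    return ["message-1", "message-2"]


batch = download_with_retries(flaky_download)
```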
-
-
class
chat_archive.backends.hangouts.
GoogleAccountCredentials
(**kw)[source]¶ Used to non-interactively provide Google Account credentials to hangups.
When you initialize a GoogleAccountCredentials object you are required to provide values for the email_address and password properties. You can set the values of these properties by passing keyword arguments to the class initializer.
Here’s an overview of the GoogleAccountCredentials class:
Superclass: PropertyManager
Public methods: get_email(), get_password() and get_verification_code()
Properties: email_address and password
-
email_address
[source]¶ The Google account email address (a string).
Note
The email_address property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named email_address (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
password
[source]¶ The Google account password (a string).
Note
The password property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named password (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
get_email
()[source]¶ Feed the configured email_address to hangups.
-
chat_archive.backends.slack
Synchronization logic for the Slack backend of the chat-archive program.
-
chat_archive.backends.slack.
FRIENDLY_NAME
= 'Slack'¶ A user friendly name for the chat service supported by this backend (a string).
-
class
chat_archive.backends.slack.
SlackBackend
(**kw)[source]¶ Container for the Slack chat archive backend.
You can set the value of the is_limited property by passing a keyword argument to the class initializer.
Here’s an overview of the SlackBackend class:
Superclass: ChatArchiveBackend
Public methods: expand_reference_callback(), get_history(), import_messages(), synchronize(), synchronize_channels(), synchronize_direct_messages() and synchronize_users()
Properties: api_token, client, http_session, is_limited, mrkdwn_to_html and spinner
-
api_token
[source]¶ The Slack API token (a string).
Note
The api_token property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
client
[source]¶ A slacker.Slacker instance initialized with api_token and http_session.
Note
The client property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
is_limited
[source]¶ Whether result sets have been limited due to the free plan.
Note
The is_limited property is a mutable_property. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can use del or delattr().
-
mrkdwn_to_html
[source]¶ An HTMLConverter object.
Note
The mrkdwn_to_html property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
http_session
[source]¶ A requests.Session object used for HTTP connection re-use.
Note
The http_session property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
spinner
[source]¶ An interactive spinner to provide feedback to the user (because the Slack backend is slow).
Note
The spinner property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
-
class
chat_archive.backends.slack.
HTMLConverter
(expand_reference_callback=None)[source]¶ Convert Slack chat messages from mrkdwn format to HTML.
-
__init__
(expand_reference_callback=None)[source]¶ Initialize an HTMLConverter object.
-
__call__
(text)[source]¶ Convert a Slack chat message to HTML.
Parameters: text – The text of a Slack message (a string).
Returns: The generated HTML (a string).
-
followed_by_alphanumeric
(input, index, limit)[source]¶ Check if the given position is followed by an alphanumeric character.
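To give a feel for what a mrkdwn to HTML conversion involves, here is a deliberately small sketch that handles only *bold*, _italic_ and line breaks. It is not the HTMLConverter implementation: the function is hypothetical, and the regular expression lookarounds merely play a role comparable to a check such as followed_by_alphanumeric(), namely refusing to treat a delimiter inside a word as markup.

```python
import html
import re


def simple_mrkdwn_to_html(text):
    """Convert a small subset of Slack's mrkdwn syntax to HTML."""
    # Escape first so that user text can't inject markup.
    converted = html.escape(text, quote=False)
    # *bold* and _italic_, but only when the delimiters aren't inside a word.
    converted = re.sub(r"(?<!\w)\*([^*\n]+)\*(?!\w)", r"<b>\1</b>", converted)
    converted = re.sub(r"(?<!\w)_([^_\n]+)_(?!\w)", r"<i>\1</i>", converted)
    return converted.replace("\n", "<br>")


formatted = simple_mrkdwn_to_html("*Done* & _really_ done")
identifier = simple_mrkdwn_to_html("snake_case_name")
```

The second call shows why the word-boundary checks matter: underscores inside an identifier are left alone instead of being turned into italics.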
-
chat_archive.backends.telegram
Synchronization logic for the Telegram backend of the chat-archive program.
The use of this backend requires the user to register on my.telegram.org/apps to get an api_id and api_hash.
-
chat_archive.backends.telegram.
FRIENDLY_NAME
= 'Telegram'¶ A user friendly name for the chat service supported by this backend (a string).
-
class
chat_archive.backends.telegram.
TelegramBackend
(**kw)[source]¶ Container for the Telegram chat archive backend.
When you initialize a TelegramBackend object you are required to provide values for the api_hash and api_id properties. You can set the values of the api_hash, api_id and session_file properties by passing keyword arguments to the class initializer.
Here’s an overview of the TelegramBackend class:
Superclass: ChatArchiveBackend
Public methods: connect_then_sync(), dialog_to_ignore(), download_messages(), is_duplicate_dialog(), is_group_conversation(), is_service_dialog(), perform_initial_sync(), recipient_to_contact(), sender_to_contact(), synchronize() and update_conversation()
Properties: api_hash, api_id, client and session_file
-
api_hash
[source]¶ The API hash used to connect to the Telegram API (a string).
The value of this property can be configured as follows:
[telegram]
api-hash = ...
You can use the api-hash-name configuration file option to specify the name of a secret in ~/.password-store instead.
Note
The api_hash property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named api_hash (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
api_id
[source]¶ The API ID used to connect to the Telegram API (an integer).
The value of this property can be configured as follows:
[telegram]
api-id = ...
You can use the api-id-name configuration file option to specify the name of a secret in ~/.password-store instead.
Note
The api_id property is a required_property. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named api_id (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
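Since api_id is an integer and api_hash is a string, reading them from an ini file involves a type conversion for the former. The sketch below parses an in-memory configuration with configparser; the values are placeholders and this is not the backend’s own loading code.

```python
import configparser

# Placeholder values; real ones come from my.telegram.org/apps.
EXAMPLE_CONFIG = """
[telegram]
api-id = 12345
api-hash = 0123456789abcdef0123456789abcdef
"""

parser = configparser.RawConfigParser()
parser.read_string(EXAMPLE_CONFIG)

api_id = parser.getint("telegram", "api-id")  # converted to an integer
api_hash = parser.get("telegram", "api-hash")  # kept as a string
```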
-
client
[source]¶ A telethon.TelegramClient object constructed based on api_id, api_hash and session_file.
Note
The client property is a lazy_property. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
session_file
[source]¶ The filename of the session file passed to
telethon.TelegramClient
.Note
The
session_file
property is amutable_property
. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can usedel
ordelattr()
.
-
dialog_to_ignore
(dialog)[source]¶ Check if this conversation should be ignored.
This method exists to exclude two types of conversations:
- The conversation with the “Telegram” user, because I don’t consider the service messages in this conversation to be relevant to my chat archive.
- Group conversations that are being synchronized as part of a different Telegram account.
-
is_duplicate_dialog
(dialog)[source]¶ Check if the given dialog is being synchronized as part of a different Telegram account.
-
is_service_dialog
(dialog)[source]¶ Check if the given dialog is the dialog with the “Telegram” user, containing service messages.
-
connect_then_sync
()[source]¶ Connect to the Telegram API and synchronize the available conversations.
-
download_messages
(dialog, conversation_in_db, min_id=0, max_id=0)[source]¶ Download messages in the given conversation.
-
perform_initial_sync
(dialog, conversation_in_db)[source]¶ Start or resume the initial synchronization.
-
update_conversation
(dialog, conversation_in_db)[source]¶ Download new messages in an existing conversation.
-
chat_archive.cli
¶
Usage: chat-archive [OPTIONS] [COMMAND]
Easy to use offline chat archive that can gather chat message history from Google Talk, Google Hangouts, Slack and Telegram.
Supported commands:
- The ‘sync’ command downloads new chat messages from supported chat services and stores them in the local archive (an SQLite database).
- The ‘search’ command searches the chat messages in the local archive for the given keyword(s) and lists matching messages.
- The ‘list’ command lists all messages in the local archive.
- The ‘stats’ command shows statistics about the local archive.
- The ‘unknown’ command searches for conversations that contain messages from an unknown sender and allows you to enter the name of a new contact to associate with all of the messages from that unknown sender. Conversations involving multiple unknown senders are not supported.
Supported options:
Option | Description |
---|---|
-C, --context=COUNT | Print COUNT messages of output context during ‘chat-archive search’. This works similarly to ‘grep -C ’. The default value of COUNT is 3. |
-f, --force | Retry synchronization of conversations where errors were previously encountered. This option is currently only relevant to the Google Hangouts backend, because I kept getting server errors when synchronizing a few specific conversations and I didn’t want to keep seeing each of those errors during every synchronization run :-). |
-c, --color=CHOICE, --colour=CHOICE | Specify whether ANSI escape sequences for text and background colors and text styles are to be used or not, depending on the value of CHOICE. |
-l, --log-file=LOGFILE | Save logs at DEBUG verbosity to the filename given by LOGFILE. This option was added to make it easy to capture the log output of an initial synchronization that will be downloading thousands of messages. |
-p, --profile=FILENAME | Enable profiling of the chat-archive application to make it possible to analyze performance problems. Python profiling data will be saved to FILENAME every time database changes are committed (making it possible to inspect the profile while the program is still running). |
-v, --verbose | Increase logging verbosity (can be repeated). |
-q, --quiet | Decrease logging verbosity (can be repeated). |
-h, --help | Show this message and exit. |
-
chat_archive.cli.
FORMATTING_TEMPLATES
= {'conversation_delimiter': '<span style="color: green">{text}</span>', 'conversation_name': '<span style="font-weight: bold; color: #FCE94F">{text}</span>', 'keyword_highlight': '<span style="color: black; background-color: yellow">{text}</span>', 'message_backend': '<span style="color: #C4A000">({text})</span>', 'message_contacts': '<span style="color: blue">{text}</span>', 'message_delimiter': '<span style="color: #555753">{text}</span>', 'message_timestamp': '<span style="color: green">{text}</span>'}¶ The formatting of output, specified as HTML with placeholders.
-
chat_archive.cli.
UNKNOWN_CONTACT_LABEL
= 'Unknown'¶ The label for contacts without a name or email address (a string).
-
class
chat_archive.cli.
UserInterface
(*args, **kw)[source]¶ The Python API for the command line interface for the
chat-archive
program.You can set the values of the
context
,keywords
,timestamp_format
anduse_colors
properties by passing keyword arguments to the class initializer.Here’s an overview of the
UserInterface
class:-
context
[source]¶ The number of messages of output context to print during searches (defaults to 3).
Note
The
context
property is amutable_property
. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can usedel
ordelattr()
.
-
use_colors
[source]¶ Whether to output ANSI escape sequences for text colors and styles (a boolean).
Note
The
use_colors
property is acustom_property
. You can change the value of this property using normal attribute assignment syntax. This property’s value is computed once (the first time it is accessed) and the result is cached. To clear the cached value you can usedel
ordelattr()
.
-
html_to_ansi
[source]¶ An
HTMLConverter
object that usesnormalize_emoji()
as a text pre-processing callback.Note
The
html_to_ansi
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
redirect_stripper
[source]¶ A
RedirectStripper
object.Note
The
redirect_stripper
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
html_to_text
[source]¶ An
HTMLStripper
object.Note
The
html_to_text
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
keyword_highlighter
[source]¶ A
KeywordHighlighter
object based onkeywords
.Note
The
keyword_highlighter
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
keywords
[source]¶ A list of strings with search keywords.
Note
The
keywords
property is amutable_property
. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can usedel
ordelattr()
.
-
timestamp_format
[source]¶ The format of timestamps (defaults to
%Y-%m-%d %H:%M:%S
).Note
The
timestamp_format
property is amutable_property
. You can change the value of this property using normal attribute assignment syntax. To reset it to its default (computed) value you can usedel
ordelattr()
.
-
search_cmd
(arguments)[source]¶ Search the chat messages in the local archive for the given keyword(s).
-
unknown_cmd
(arguments)[source]¶ Find private conversations with messages from an unknown sender and interactively prompt the operator to provide a name for a new contact to associate the messages with.
-
generate_html
(name, text)[source]¶ Generate HTML based on a named format string.
Parameters:
- name – The name of an HTML format string in FORMATTING_TEMPLATES (a string).
- text – The text to interpolate (a string).
Returns: The generated HTML (a string).
This method does not escape the text given to it, in other words it is up to the caller to decide whether embedded HTML is allowed or not.
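A minimal sketch of the interpolation described for generate_html(), assuming the method simply applies str.format() to the named template. The two dictionary entries are copied from FORMATTING_TEMPLATES; the standalone function and the TEMPLATES name are hypothetical stand-ins, not the package's code.

```python
import html

# Two entries copied from FORMATTING_TEMPLATES; the real dictionary has more.
TEMPLATES = {
    "message_contacts": '<span style="color: blue">{text}</span>',
    "message_timestamp": '<span style="color: green">{text}</span>',
}


def generate_html(name, text):
    """Interpolate text into a named template without escaping it (assumed behavior)."""
    return TEMPLATES[name].format(text=text)


# Because nothing is escaped here, the caller escapes untrusted text up front.
untrusted = "<b>Alice</b>"
print(generate_html("message_contacts", html.escape(untrusted)))
```

Leaving the escaping to the caller lets trusted HTML fragments (for example already highlighted keywords) pass through unchanged.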
-
normalize_whitespace
(text)[source]¶ Normalize the whitespace in a chat message before rendering on the terminal.
Parameters: text – The chat message text (a string).
Returns: The normalized text (a string).
This method works as follows:
- First leading and trailing whitespace is stripped from the text.
- When the resulting text consists of a single line, it is processed using compact() and returned.
- When the resulting text contains multiple lines the text is prefixed with a newline character, so that the chat message starts on its own line. This ensures that messages requiring vertical alignment render properly (for example a table drawn with | and - characters).
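The three steps above can be sketched as follows. This is an illustration only: the local compact() is a simplified stand-in (it collapses runs of whitespace into single spaces) and may differ from the compact() function the method actually calls.

```python
def compact(text):
    """Simplified stand-in for compact(): collapse whitespace runs into single spaces."""
    return " ".join(text.split())


def normalize_whitespace(text):
    """Sketch of the documented steps; not the package's implementation."""
    # Step 1: strip leading and trailing whitespace.
    text = text.strip()
    # Step 2: single-line messages are compacted and returned.
    if len(text.splitlines()) <= 1:
        return compact(text)
    # Step 3: multi-line messages start on their own line so that
    # vertically aligned content (like ASCII tables) stays aligned.
    return "\n" + text


print(repr(normalize_whitespace("  hello    world  ")))
print(repr(normalize_whitespace("a | b\n- | -\n")))
```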
-
render_conversation_summary
(conversation)[source]¶ Render a summary of which conversation a message is part of.
-
prepare_output
(text)[source]¶ Prepare text for rendering on the terminal.
Parameters: text – The HTML text to render (a string). Returns: The rendered text (a string). When
use_colors
isTrue
this method first useskeyword_highlighter
to highlight search matches in the given text and then it converts the string from HTML to ANSI escape sequences usinghtml_to_ansi
.When
use_colors
isFalse
thenhtml_to_text
is used to convert the given HTML to plain text. In this case keyword highlighting is skipped.
-
render_output
(text)[source]¶ Render text on the terminal.
Parameters: text – The HTML text to render (a string). Refer to
prepare_output()
for details about how text is converted from HTML to text with ANSI escape sequences.
-
get_contact_name
(contact)[source]¶ Get a short string describing a contact (preferably their first name, but if that is not available then their email address will have to do). If no useful information is available
UNKNOWN_CONTACT_LABEL
is returned so as to explicitly mark the absence of more information.
-
chat_archive.database
¶
SQLAlchemy based database helpers.
-
class
chat_archive.database.
DatabaseClient
(*args, **kw)[source]¶ Simple wrapper for SQLAlchemy that makes it easy to use with SQLite.
When you initialize a
DatabaseClient
object you are required to provide a value for thedatabase_url
property. You can set the values of thedatabase_file
,database_url
andecho_queries
properties by passing keyword arguments to the class initializer.Here’s an overview of the
DatabaseClient
class:Superclass: ProfileManager
Special methods: __exit__()
and__init__()
Public methods: commit_changes()
Properties: database_engine
,database_file
,database_url
,echo_queries
,session
andsession_factory
-
__init__
(*args, **kw)[source]¶ Initialize a
DatabaseClient
object.Please refer to the
PropertyManager
documentation for details about the handling of arguments.
-
database_engine
[source]¶ An SQLAlchemy database engine connected to
database_url
.Note
The
database_engine
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
database_file
[source]¶ The absolute pathname of an SQLite database file (a string or
None
).Note
The
database_file
property is awritable_property
. You can change the value of this property using normal attribute assignment syntax.
-
database_url
[source]¶ A URL that indicates the database dialect and connection arguments to SQLAlchemy (a string).
The value of
database_url
defaults to a URL that instructs SQLAlchemy to use an SQLite 3 database file located at the pathname given bydatabase_file
, but of course you are free to point SQLAlchemy to any supported database server.Note
The
database_url
property is arequired_property
. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named database_url (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
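As a rough illustration of the default described above, an SQLite URL can be derived from a database filename as shown below. The helper name default_database_url() is hypothetical and the exact construction used by DatabaseClient is an assumption; the sqlite:/// prefix followed by the pathname is the URL form SQLAlchemy documents for SQLite files.

```python
import os


def default_database_url(database_file):
    """Build an SQLite URL for SQLAlchemy from a filename (hypothetical helper)."""
    # An absolute POSIX pathname yields four slashes in total:
    # three from the URL prefix plus the leading slash of the path.
    return "sqlite:///" + os.path.abspath(database_file)


print(default_database_url("/tmp/chat-archive.sqlite3"))
```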
-
echo_queries
[source]¶ Whether queries should be logged to
sys.stderr
(a boolean, defaults toFalse
).Note
The
echo_queries
property is awritable_property
. You can change the value of this property using normal attribute assignment syntax.
-
session
[source]¶ An SQLAlchemy session created by
session_factory
.Note
The
session
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
session_factory
[source]¶ An SQLAlchemy session factory connected to
database_engine
.Note
The
session_factory
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
-
class
chat_archive.database.
SchemaManager
(*args, **kw)[source]¶ Easy to use database schema upgrades based on Alembic.
You can set the values of the
alembic_directory
,auto_create_schema
,auto_upgrade_schema
anddeclarative_base
properties by passing keyword arguments to the class initializer.Here’s an overview of the
SchemaManager
class:Superclass: DatabaseClient
Special methods: __init__()
Public methods: initialize_schema()
andrun_migrations()
Properties: alembic_config
,alembic_directory
,auto_create_schema
,auto_upgrade_schema
,current_schema_revision
,declarative_base
,latest_schema_revision
andschema_up_to_date
-
__init__
(*args, **kw)[source]¶ Initialize a
SchemaManager
object.This method automatically calls
run_migrations()
(andinitialize_schema()
when the database is initially created) to ensure that the database schema is up to date.
-
alembic_config
[source]¶ A minimal Alembic configuration object.
This configuration object contains two options:
sqlalchemy.url
is set todatabase_url
script_location
is set toalembic_directory
Raises: ValueError
whenalembic_directory
isn’t set.Note
The
alembic_config
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
alembic_directory
[source]¶ The absolute pathname of the directory containing Alembic’s
env.py
file (a string orNone
).Note
The
alembic_directory
property is awritable_property
. You can change the value of this property using normal attribute assignment syntax.
-
auto_create_schema
[source]¶ True if automatic database schema initialization is enabled, False otherwise.
This defaults to True when declarative_base is set, False otherwise.
Note
The
auto_create_schema
property is awritable_property
. You can change the value of this property using normal attribute assignment syntax.
-
auto_upgrade_schema
[source]¶ True if automatic database schema upgrades are enabled, False otherwise.
This defaults to True when alembic_directory is set, False otherwise.
Note
The
auto_upgrade_schema
property is awritable_property
. You can change the value of this property using normal attribute assignment syntax.
-
current_schema_revision
[source]¶ The current database schema revision in the database that we’re connected to (a string or
None
).Note
The
current_schema_revision
property is acached_property
. This property’s value is computed once (the first time it is accessed) and the result is cached. To clear the cached value you can usedel
ordelattr()
.
-
declarative_base
[source]¶ The base class for declarative models defined using SQLAlchemy.
Note
The
declarative_base
property is awritable_property
. You can change the value of this property using normal attribute assignment syntax.
-
latest_schema_revision
[source]¶ The current schema revision according to Alembic’s migration scripts (a string).
Note
The
latest_schema_revision
property is alazy_property
. This property’s value is computed once (the first time it is accessed) and the result is cached.
-
initialize_schema
()[source]¶ Initialize the database schema using SQLAlchemy.
This method is automatically called when a
SchemaManager
object is created. In order to initialize the database schema thedeclarative_base
property needs to be set, but if it’s not set theninitialize_schema()
won’t complain.
-
run_migrations
()[source]¶ Upgrade the database schema using Alembic.
This method is automatically called when a
SchemaManager
object is created. In order to upgrade the database schema thealembic_directory
property needs to be set, but if it’s not set thenrun_migrations()
won’t complain.
-
-
class
chat_archive.database.
CustomVerbosity
(**kw)[source]¶ Easily customize logging verbosity for a given scope.
This is used by
SchemaManager
to silence Alembic because it’s rather verbose by default, presumably because its primary purpose is to be a command line program and not a library embedded in an application.When you initialize a
CustomVerbosity
object you are required to provide a value for thelevel
property. You can set the values of thelevel
andoriginal_level
properties by passing keyword arguments to the class initializer.Here’s an overview of the
CustomVerbosity
class:Superclass: PropertyManager
Special methods: __enter__()
and__exit__()
Properties: level
andoriginal_level
-
level
[source]¶ The overridden logging verbosity level.
Note
The
level
property is arequired_property
. You are required to provide a value for this property by calling the constructor of the class that defines the property with a keyword argument named level (unless a custom constructor is defined, in this case please refer to the documentation of that constructor). You can change the value of this property using normal attribute assignment syntax.
-
original_level
[source]¶ The original logging verbosity level.
Note
The
original_level
property is awritable_property
. You can change the value of this property using normal attribute assignment syntax.
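The idea behind CustomVerbosity can be sketched with the standard logging module. The class below is a hypothetical stand-in named VerbosityOverride; its logger argument is an assumption made so the example is self-contained, and the real class may select the logger differently.

```python
import logging


class VerbosityOverride:
    """Temporarily override a logger's level inside a with block (sketch)."""

    def __init__(self, level, logger=None):
        self.level = level
        # Assumption: fall back to the root logger when none is given.
        self.logger = logging.getLogger() if logger is None else logger
        self.original_level = None

    def __enter__(self):
        # Remember the level that was active so it can be restored later.
        self.original_level = self.logger.level
        self.logger.setLevel(self.level)
        return self

    def __exit__(self, exc_type=None, exc_value=None, traceback=None):
        self.logger.setLevel(self.original_level)


noisy = logging.getLogger("alembic")
noisy.setLevel(logging.INFO)
with VerbosityOverride(level=logging.WARNING, logger=noisy):
    noisy.info("this message is suppressed inside the block")
```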
-
chat_archive.emoji
¶
Utility functions to translate between various forms of smilies and emoji.
chat_archive.html
¶
Utility functions for working with HTML encoded text.
-
chat_archive.html.
BLOCK_TAGS
= ['div', 'p', 'pre']¶ A list of strings with HTML tags that are considered block-level elements. The
HTMLStripper
emits an empty line before and after each block-level element that it encounters.
-
chat_archive.html.
URL_PATTERN
= re.compile('(http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+)')¶ A compiled regular expression pattern to find URLs in text (credit: taken from urlregex.com).
-
chat_archive.html.
html_to_text
(html_text)[source]¶ Convert HTML to plain text.
Parameters: html_text – A fragment of HTML (a string). Returns: The plain text (a string). This function uses the
HTMLStripper
class that builds on top of thehtml.parser.HTMLParser
class in the Python standard library.
-
chat_archive.html.
text_to_html
(text, callback=None)[source]¶ Convert plain text to HTML.
Parameters: - text – A fragment of plain text (a string).
- callback – An optional callback that provides the caller a chance to pre-process text before it is encoded as HTML.
Returns: The HTML encoded text (a string).
This function replaces URLs with
<a href="...">
tags and escapes special characters, that’s it, nothing more.
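A minimal sketch of the behavior described for text_to_html(), under two assumptions: the URL pattern below is a deliberately simplified one (not the module's URL_PATTERN), and the callback is applied to the non-URL text before escaping.

```python
import html
import re

# Simplified pattern for illustration; the module's URL_PATTERN is more elaborate.
SIMPLE_URL_PATTERN = re.compile(r"(https?://[^\s<>\"']+)")


def text_to_html(text, callback=None):
    """Escape special characters and wrap URLs in <a href="..."> tags (sketch)."""
    parts = []
    # re.split() with one capturing group alternates plain text and URLs,
    # so tokens at odd indexes are the matched URLs.
    for index, token in enumerate(SIMPLE_URL_PATTERN.split(text)):
        if index % 2 == 1:
            url = html.escape(token, quote=True)
            parts.append('<a href="%s">%s</a>' % (url, url))
        else:
            if callback is not None:
                token = callback(token)
            parts.append(html.escape(token))
    return "".join(parts)


print(text_to_html("see https://example.com/a?b=1&c=2 <now>"))
```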
-
class
chat_archive.html.
HTMLStripper
(*, convert_charrefs=True)[source]¶ A simple HTML to text converter based on
html.parser.HTMLParser
.-
__call__
(data)[source]¶ Convert HTML to text.
Parameters: data – The HTML to convert to text (a string). Returns: The converted text (a string). This method calls
compact_empty_lines()
on the converted text to normalize superfluous empty lines caused by vertical whitespace emitted around block level elements like<div>
,<p>
and<pre>
.
-
handle_charref
(value)[source]¶ Process a decimal or hexadecimal numeric character reference.
Parameters: value – The decimal or hexadecimal value (a string).
-
handle_entityref
(name)[source]¶ Process a named character reference.
Parameters: name – The name of the character reference (a string).
-
reset
()[source]¶ Reset the state of the
HTMLStripper
instance.
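To illustrate the approach, here is a small converter built on html.parser.HTMLParser that emits an empty line around the block-level tags listed in BLOCK_TAGS. The class name MiniStripper is hypothetical, and the regular expression used to collapse empty lines is a stand-in for compact_empty_lines(); the real HTMLStripper may differ in detail.

```python
import re
from html.parser import HTMLParser

BLOCK_TAGS = ["div", "p", "pre"]


class MiniStripper(HTMLParser):
    """Tiny HTML to text converter (sketch, not the package's HTMLStripper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.output = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.output.append("\n\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.output.append("\n\n")

    def handle_data(self, data):
        self.output.append(data)

    def __call__(self, data):
        self.reset()
        self.output = []
        self.feed(data)
        self.close()
        text = "".join(self.output)
        # Stand-in for compact_empty_lines(): collapse superfluous empty lines.
        return re.sub(r"\n{3,}", "\n\n", text).strip()


print(MiniStripper()("<p>Hello &amp; <b>bye</b></p><div>next</div>"))
```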
-
chat_archive.html.keywords
¶
Utility functions for working with HTML encoded text.
-
class
chat_archive.html.keywords.
KeywordHighlighter
(*args, **kw)[source]¶ A simple keyword highlighter for HTML based on
html.parser.HTMLParser
.-
__init__
(*args, **kw)[source]¶ Initialize a
KeywordHighlighter
object.Parameters: - keywords – A list of strings with keywords to highlight.
- highlight_template – A template string with the
{text}
placeholder that’s used to highlight keyword matches.
-
chat_archive.html.redirects
¶
Utility functions to pre-process URLs before rendering on a terminal.
In web browsers and chat clients the URLs behind hyperlinks are usually hidden, but in a terminal there’s no “out of band” mechanism to communicate the URL behind a hyperlink - the URL needs to appear literally in the text that is rendered to the terminal.
Given this requirement, I’ve become rather annoyed at Google prefixing every
URL they can get their hands on with https://www.google.com/url?q=…
because
this user hostile “encoding” obscures the intended URL with a lot of fluff that
I don’t care for.
This module contains the expand_url()
function to transform redirect
URLs into their target URL, the strip_redirects()
function to
transform all redirect URLs in a given text and RedirectStripper
to
transform all redirect URLs in a given HTML fragment.
-
chat_archive.html.redirects.
GOOGLE_REDIRECT_URL
= 'www.google.com/url'¶ The base URL of the Google redirect service (a string).
Note that the URL scheme is omitted on purpose, to enable a substring search for the Google redirect service regardless of whether a given URL is using the
http://
orhttps://
scheme.
-
chat_archive.html.redirects.
URL_PATTERN
= re.compile('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')¶ A compiled regular expression pattern to find URLs in text (credit: taken from urlregex.com).
-
chat_archive.html.redirects.
expand_url
(url)[source]¶ Expand a redirect URL to its target URL.
Parameters: url – The URL to expand (a string). Returns: The expanded URL (a string).
-
chat_archive.html.redirects.
strip_redirects
(text)[source]¶ Expand redirect URLs in the given text.
Parameters: text – The text to process (a string). Returns: The processed text (a string).
-
chat_archive.html.redirects.
strip_redirects_callback
(match)[source]¶ Apply
expand_url()
to the matched URL.
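The three functions above can be approximated as follows. This sketch assumes that the target URL is carried in the q query parameter (as in the https://www.google.com/url?q=… example mentioned earlier) and uses a simplified URL pattern rather than the module's URL_PATTERN; the actual implementation may handle more cases.

```python
import re
from urllib.parse import parse_qs, urlparse

GOOGLE_REDIRECT_URL = "www.google.com/url"
# Simplified pattern for illustration only.
SIMPLE_URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def expand_url(url):
    """Return the target of a Google redirect URL, or the URL unchanged (sketch)."""
    if GOOGLE_REDIRECT_URL in url:
        targets = parse_qs(urlparse(url).query).get("q")
        if targets:
            return targets[0]
    return url


def strip_redirects_callback(match):
    """Apply expand_url() to the matched URL."""
    return expand_url(match.group(0))


def strip_redirects(text):
    """Expand every redirect URL found in the given text (sketch)."""
    return SIMPLE_URL_PATTERN.sub(strip_redirects_callback, text)


print(strip_redirects("see https://www.google.com/url?q=https://example.com/page&sa=D now"))
```

Because the scheme is left out of GOOGLE_REDIRECT_URL, the substring check matches both http:// and https:// redirect URLs.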
-
class
chat_archive.html.redirects.
RedirectStripper
(*, convert_charrefs=True)[source]¶ Expand redirect URLs embedded in HTML.
This class uses
html.parser.HTMLParser
to parse HTML and expand any redirect URLs that it encounters to their target URL. The__call__()
method provides an easy way to use this functionality.
chat_archive.models
¶
Database models for the chat-archive program based on SQLAlchemy.
The chat_archive.models
module defines the following database models for
the chat-archive program:
-
chat_archive.models.
metadata
= MetaData(bind=None)¶ Define an explicit naming convention to simplify future database migrations.
-
class
chat_archive.models.
Base
(**kwargs)¶ The most base type
-
__init__
(**kwargs)¶ A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in
kwargs
.Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
-
chat_archive.models.
address_mapping
= Table('email_address_mapping', MetaData(bind=None), Column('contact_id', Integer(), ForeignKey('contacts.id'), table=<email_address_mapping>), Column('address_id', Integer(), ForeignKey('email_addresses.id'), table=<email_address_mapping>), schema=None)¶ Mapping table for many-to-many relationship between contacts and email addresses.
-
chat_archive.models.
telephone_number_mapping
= Table('telephone_number_mapping', MetaData(bind=None), Column('contact_id', Integer(), ForeignKey('contacts.id'), table=<telephone_number_mapping>), Column('telephone_number_id', Integer(), ForeignKey('telephone_numbers.id'), table=<telephone_number_mapping>), schema=None)¶ Mapping table for many-to-many relationship between contacts and telephone numbers.
-
class
chat_archive.models.
Account
(**kwargs)[source]¶ Database model for chat accounts.
-
id
¶ The primary key of the account (an integer).
-
backend
¶ The name of the backend that manages this account (a string).
-
name
¶ A user defined name for the account (a string).
-
contacts
¶ The contacts that have been imported using this account.
-
conversations
¶ The conversations that have been imported using this account.
-
name_is_significant
¶ True
if the database contains multiple accounts with thisbackend
,False
otherwise.
-
__init__
(**kwargs)¶ A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in
kwargs
.Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
-
class
chat_archive.models.
EmailAddress
(**kwargs)[source]¶ Database model for email addresses of chat contacts.
-
id
¶ The primary key of the email address (an integer).
-
value
¶ The email address itself (a string).
-
__repr__
()[source]¶ Render a human friendly representation of an
EmailAddress
object.
-
__str__
()[source]¶ Render a human friendly representation of an
EmailAddress
object.
-
__init__
(**kwargs)¶ A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in
kwargs
.Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
-
class
chat_archive.models.
TelephoneNumber
(**kwargs)[source]¶ Database model for telephone numbers of chat contacts.
-
id
¶ The primary key of the telephone number (an integer).
-
value
¶ The telephone number itself (a string).
-
__repr__
()[source]¶ Render a human friendly representation of a
TelephoneNumber
object.
-
__str__
()[source]¶ Render a human friendly representation of a
TelephoneNumber
object.
-
__init__
(**kwargs)¶ A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in
kwargs
.Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
-
class
chat_archive.models.
Contact
(**kwargs)[source]¶ Database model for chat contacts.
-
id
¶ The primary key of the contact (an integer).
-
account_id
¶ A foreign key to associate contacts with accounts.
-
email_addresses
¶ The email addresses of this contact.
-
telephone_numbers
¶ The telephone numbers of this contact.
-
sent_messages
¶ The chat messages that were sent by this contact.
-
received_messages
¶ The chat messages that were received by this contact.
-
first_name_is_unambiguous
¶ True
if this first name unambiguously refers to a single contact,False
otherwise.
-
full_name
¶ The full name of the contact (as an SQL expression).
-
__init__
(**kwargs)¶ A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in
kwargs
.Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
-
class
chat_archive.models.
Conversation
(**kwargs)[source]¶ Database model for chat conversations.
-
id
¶ The primary key of the conversation (an integer).
-
account_id
¶ A foreign key to associate conversations with accounts.
-
__init__
(**kwargs)¶ A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in
kwargs
.Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
is_group_conversation
¶ Whether the conversation is a group conversation (a boolean, defaults to
False
).
-
messages
¶ The chat messages that belong to this conversation.
-
have_unknown_senders
¶ Whether this conversation includes messages from unknown senders (a boolean).
-
-
class
chat_archive.models.
Message
(**kwargs)[source]¶ Database model for chat messages.
Note that the
Message
model doesn’t have a direct relationship to theAccount
model because these two models already have an indirect relationship via theConversation
model (in other words, messages are implicitly namespaced to accounts via conversations).-
__init__
(**kwargs)¶ A simple constructor that allows initialization from kwargs.
Sets attributes on the constructed instance using the names and values in
kwargs
.Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
-
id
¶ The primary key of the chat message (an integer).
-
conversation_id
¶ A foreign key to associate chat messages with conversations.
-
recipient_id
¶ A foreign key that points to the contact who received this message (an integer or
None
).
-
raw
¶ The raw message text in a backend specific format (a string or
None
).The reason that this field was added to the database schema is because the Slack backend emits chat messages in the somewhat peculiar mrkdwn format which is “almost but not quite” human readable (in my opinion). When the Slack backend imports a new message, the following steps take place:
- The original message text is stored without any modifications in the raw column.
- A custom mrkdwn parser developed for the chat-archive program is used to convert raw to html (during the import).
- The value of html is used to generate the value of text (during the import). If this surprises you: I could have developed a second mrkdwn converter with a different output format, but that’s 150 lines of code I don’t care to repeat and html_to_text() works fine for this purpose 😇.
If the custom mrkdwn parser (which is bound to contain bugs) receives bug fixes in a new release of the chat-archive program then
raw
values can be used to regeneratetext
andhtml
values.
-
text
¶ The human readable plain text of the chat message (a string).
This field cannot be
None
(NULL
) and is expected to always contain a nonempty chat message text. This field is used during searches and when chat-archive --color=never
is run.
-
html
¶ The formatted text of the chat message (a string or
None
).When a chat message doesn’t contain text formatting or hyperlinks
html
will beNone
andtext
should be used instead. This field will be used whenchat-archive --color=yes
is run.
-
conversation
¶ The conversation that this chat message took place in (a
Conversation
object orNone
).
-
newer_messages
¶ Newer messages in the conversation (not yet sorted!).
-
older_messages
¶ Older messages in the conversation (not yet sorted!).
-
chat_archive.profiling
¶
Easy to use Python code profiling support.
-
class
chat_archive.profiling.
ProfileManager
(*args, **kw)[source]¶ Base class for easy to use Python code profiling support.
This class makes it easy to enable and disable Python code profiling and save the results to a file. You can use it in a
with
statement to guarantee that the profile is saved even when your program is interrupted with Control-C, so when your program is too slow and you’re wondering why you can just restart the program with profiling enabled, wait for it to get slow, give it a while to collect profile statistics and then interrupt it with Control-C.When
profile_file
is set the class initializer method will automatically callenable_profiling()
.You can set the values of the
profile_file
,profiler
andprofiling_enabled
properties by passing keyword arguments to the class initializer.Here’s an overview of the
ProfileManager
class:Superclass: PropertyManager
Special methods: __enter__()
,__exit__()
and__init__()
Public methods: disable_profiling()
,enable_profiling()
andsave_profile()
Properties: can_save_profile
,profile_file
,profiler
andprofiling_enabled
-
__init__
(*args, **kw)[source]¶ Initialize a
ProfileManager
object.Please refer to the
PropertyManager
documentation for details about the handling of arguments.
-
__exit__
(exc_type=None, exc_value=None, traceback=None)[source]¶ Disable code profiling and save the profile statistics when the
with
block ends.
-
can_save_profile
¶ True if save_profile() is expected to work, False otherwise.
-
profile_file
[source]¶ The pathname of a file where Python profile statistics should be saved (a string or None).
Note
The profile_file property is a writable_property. You can change the value of this property using normal attribute assignment syntax.
-
profiler
[source]¶ A profile.Profile object (if profile_file is set) or None.
Note
The profiler property is a writable_property. You can change the value of this property using normal attribute assignment syntax.
-
profiling_enabled
[source]¶ True if code profiling is enabled, False otherwise.
Note
The profiling_enabled property is a writable_property. You can change the value of this property using normal attribute assignment syntax.
-
save_profile
(filename=None)[source]¶ Save gathered profile statistics to a file.
Parameters: filename – The pathname of the profile file (a string or None). Defaults to the value of profile_file.
Raises: ValueError when profiling was never enabled, or when filename isn’t given and profile_file also isn’t set.
-
chat_archive.utils
¶
Utility functions for the chat-archive program.
-
chat_archive.utils.
ensure_directory_exists
(pathname)[source]¶ Create a directory if it doesn’t exist yet.
Parameters: pathname – The pathname of the directory (a string).
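A function with this contract can be sketched in one line using the standard library. The name ensure_directory_exists_sketch is hypothetical, and creating missing parent directories is an assumption of this sketch; the documentation does not say whether the real function does so.

```python
import os
import tempfile


def ensure_directory_exists_sketch(pathname):
    """Create `pathname` unless it already exists (assumption: parents too)."""
    # exist_ok=True makes repeated calls harmless instead of raising an error.
    os.makedirs(pathname, exist_ok=True)


# Usage: calling it twice on the same path is safe.
demo_directory = os.path.join(tempfile.mkdtemp(), "archive", "profiles")
ensure_directory_exists_sketch(demo_directory)
ensure_directory_exists_sketch(demo_directory)
```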
-
chat_archive.utils.
get_full_name
()[source]¶ Find the full name of the current user on the local system based on /etc/passwd.
Returns: A string with the full name of the current user or an empty string when this information is not available.
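One way to obtain this information on Unix systems is the standard pwd module, as in the minimal sketch below. The name get_full_name_sketch is hypothetical, and the sketch is not necessarily how the package does it. Note that pwd queries the system's user database, which may include sources other than /etc/passwd.

```python
import os
import pwd  # Unix only


def get_full_name_sketch():
    """Return the current user's full name, or an empty string if unknown."""
    try:
        entry = pwd.getpwuid(os.getuid())
    except KeyError:
        # No passwd entry for the current user ID.
        return ""
    # By convention the GECOS field holds comma-separated values and the
    # first value is the user's full name.
    return entry.pw_gecos.split(",")[0].strip()
```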
-
chat_archive.utils.
get_secret
(options, value_option, name_option, description)[source]¶ Get a secret needed to connect to a chat service (like a password or API token).
Parameters: - options – A dictionary with configuration options.
- value_option – The name of the configuration option that defines the value of a secret (a string).
- name_option – The name of the configuration option that defines the name of a secret in ~/.password-store (a string). See also get_secret_from_store().
- description – A description of the type of secret that the operator will be prompted for (a string).
Returns: The password (a string).
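The parameters suggest three possible sources for a secret: a literal value in the configuration, a named entry in the password store, and an interactive prompt. The sketch below shows one plausible way to combine them. The order of the steps is an assumption, as are the lookup and prompt hooks and the option names used in the usage lines; none of these are taken from the package.

```python
import getpass


def get_secret_sketch(options, value_option, name_option, description,
                      lookup=None, prompt=getpass.getpass):
    """Resolve a secret; the order of the three steps is an assumption."""
    # 1. A secret given literally in the configuration.
    if options.get(value_option):
        return options[value_option]
    # 2. The name of a password store entry (`lookup` is a hypothetical hook).
    if options.get(name_option) and lookup is not None:
        return lookup(options[name_option])
    # 3. Fall back to asking the operator interactively.
    return prompt("Please enter the %s: " % description)


# Usage with hypothetical option names and a fake password store lookup.
direct = get_secret_sketch({"password": "s3cret"}, "password", "password-name", "Slack token")
stored = get_secret_sketch({"password-name": "chat/slack"}, "password", "password-name",
                           "Slack token", lookup=lambda name: "from-store:" + name)
```

Passing the lookup and prompt as arguments keeps the sketch testable without a real password store or terminal.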
-
chat_archive.utils.
get_secret_from_store
(name, directory=None)[source]¶ Use qpass to get a secret from ~/.password-store.
Parameters: - name – The name of a password or a search pattern that matches a single entry in the password store (a string).
- directory – The directory to use (a string, defaults to ~/.password-store).
Returns: The secret (a string).
Raises: ValueError when the given name doesn’t match any entries or matches multiple entries in the password store.
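The "exactly one match" rule can be illustrated without qpass. The sketch below only selects an entry name from a directory of .gpg files, the layout used by the pass password manager. It does not decrypt anything and does not use the qpass API. The function names and the case-insensitive substring matching are assumptions made for this example.

```python
import os
import tempfile


def select_single_entry(name, directory):
    """Return the one entry whose name contains `name`, else raise ValueError."""
    matches = []
    for root, _dirs, files in os.walk(directory):
        for filename in files:
            if filename.endswith(".gpg"):
                relative = os.path.relpath(os.path.join(root, filename), directory)
                entry = relative[:-len(".gpg")].replace(os.sep, "/")
                if name.lower() in entry.lower():
                    matches.append(entry)
    if len(matches) != 1:
        raise ValueError("Expected exactly one match for %r, got %d!" % (name, len(matches)))
    return matches[0]


def build_demo_store():
    """Create a throwaway directory with two empty placeholder entries."""
    directory = tempfile.mkdtemp()
    os.makedirs(os.path.join(directory, "chat"))
    for entry in ("slack", "telegram"):
        open(os.path.join(directory, "chat", entry + ".gpg"), "w").close()
    return directory
```

With the demo store, the pattern "slack" selects "chat/slack", while "chat" (two matches) and "irc" (no matches) both raise ValueError.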