#!/bin/bash
export PATH=/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
# Build clickhouse-client connection arguments from the environment:
# prefer authenticated host settings, fall back to host-only, then to localhost.
if [[ $ENGINE_CORE_CLICKHOUSE_USER != '' ]]; then
    CLARG="--host ${ENGINE_CORE_CLICKHOUSE_HOST} --port ${ENGINE_CORE_CLICKHOUSE_PORT} --user ${ENGINE_CORE_CLICKHOUSE_USER} --password ${ENGINE_CORE_CLICKHOUSE_PASSWORD}"
elif [[ $ENGINE_CORE_CLICKHOUSE_HOST != '' ]]; then
    CLARG="--host ${ENGINE_CORE_CLICKHOUSE_HOST} --port ${ENGINE_CORE_CLICKHOUSE_PORT}"
else
    CLARG="--host 127.0.0.1 --port 9000"
fi

# tf-idf (term frequency - inverse document frequency)
# Commonly used to mine the keywords of a document;
# within a single document, a high tf-idf value means the term is highly discriminative for that document:
# it appears repeatedly in that document but rarely across the whole corpus (inverse document frequency).
# A high value over the whole corpus carries no special meaning and is not suited for cross-document comparison.
# Term-frequency vectorization
from sklearn.feature_extraction.text import CountVectorizer
# The token_pattern parameter specifies the regex used to split tokens, e.g. r"(?u)\b[^@]+\b" or r"\b\w+\b"
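A minimal sketch of how the pieces above fit together: count term frequencies with CountVectorizer, then re-weight by inverse document frequency with TfidfTransformer. The toy corpus and the token_pattern value are illustrative assumptions, not part of the original snippet.

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Hypothetical toy corpus, purely for illustration.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]

# Count term frequencies; this token_pattern also keeps single-character tokens.
counter = CountVectorizer(token_pattern=r"(?u)\b\w+\b")
counts = counter.fit_transform(docs)

# Re-weight the raw counts by inverse document frequency.
tfidf = TfidfTransformer().fit_transform(counts)
print(counter.vocabulary_)
print(tfidf.toarray())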
# Format data: emit one line per 256-byte window; within each window, take the
# first two bytes of every 16-byte step and print them as a byte-swapped
# (little-endian) 16-bit hex word.
out = "\n".join([
    " ".join([
        f"{data[i+1]:02x}{data[i]:02x}"
        for i in range(line, min(line + 256, length), 16)
    ])
    for line in range(0, length, 256)
])
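A small self-contained driver for the formatter above; the 64-byte buffer standing in for data is an invented example, not from the original.

# Hypothetical 64-byte buffer standing in for `data`; enough for
# four 16-byte steps inside a single 256-byte window.
data = bytes(range(64))
length = len(data)

out = "\n".join([
    " ".join([
        f"{data[i+1]:02x}{data[i]:02x}"
        for i in range(line, min(line + 256, length), 16)
    ])
    for line in range(0, length, 256)
])
print(out)  # -> 0100 1110 2120 3130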
import mmap
import struct

# struct.unpack returns a tuple even for a single value.
print(struct.unpack('<i', b'\xa0\xcf\x6e\x44'))   # (1148112800,)  little-endian int32
struct.unpack('>i', b'\x95\x6b\x31\x93')          # (-1788137069,) big-endian int32

# Memory-map the file read-only and search for a byte pattern.
with open('tmp', 'rb', 0) as file, \
        mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    pos = s.find(b'\x64\x65')
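A possible continuation under the same assumptions (a file named tmp exists and contains the two-byte marker); the little-endian uint32 read after the match is purely illustrative, not the original author's format.

import mmap
import struct

with open('tmp', 'rb', 0) as file, \
        mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    pos = s.find(b'\x64\x65')
    if pos != -1:
        # Hypothetical: read a little-endian uint32 that follows the marker.
        (value,) = struct.unpack_from('<I', s, pos + 2)
        print(f'marker at byte {pos}, following uint32 = {value}')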
import numpy

def dense_to_one_hot(labels_dense, num_classes):
    """Convert class labels from scalars to one-hot vectors."""
    num_labels = labels_dense.shape[0]
    index_offset = numpy.arange(num_labels) * num_classes
    labels_one_hot = numpy.zeros((num_labels, num_classes))
    # Flat indexing: row i gets a 1 at column labels_dense[i].
    labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
    return labels_one_hot

It looks like this is a piece of code.
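A quick usage sketch for the helper above; the label array and class count are made-up values.

import numpy

labels = numpy.array([0, 2, 1, 2])              # hypothetical dense labels
one_hot = dense_to_one_hot(labels, num_classes=3)
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [0. 0. 1.]]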
# Shuffle and split the data directly with numpy
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
labels = to_categorical(np.asarray(labels))
print('Shape of Data Tensor:', data.shape)
print('Shape of Label Tensor:', labels.shape)

# Permute sample indices before carving out a validation split.
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
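A plausible continuation of the shuffle-and-split above, in the style of the usual Keras text-classification examples; the VALIDATION_SPLIT fraction and the train/validation variable names are assumptions, not the original snippet's code.

# Hypothetical split fraction; not part of the original snippet.
VALIDATION_SPLIT = 0.2

data = data[indices]
labels = labels[indices]
num_validation_samples = int(VALIDATION_SPLIT * data.shape[0])

x_train = data[:-num_validation_samples]
y_train = labels[:-num_validation_samples]
x_val = data[-num_validation_samples:]
y_val = labels[-num_validation_samples:]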
import tensorflow as tf

class Metrics(tf.keras.callbacks.Callback):
    """Collect per-epoch evaluation metrics during training."""

    def on_train_begin(self, logs=None):
        self.confusion = []
        self.precision = []
        self.recall = []
        self.f1s = []
        self.kappa = []
        self.auc = []

    def on_epoch_end(self, epoch, logs=None):
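One way the truncated on_epoch_end might be filled in, assuming the callback is handed the validation arrays at construction time (a common pattern). The scikit-learn metric calls and the 0.5 binary-classification threshold are assumptions for illustration, not the gist author's code.

import numpy as np
import tensorflow as tf
from sklearn import metrics as skm

class Metrics(tf.keras.callbacks.Callback):
    def __init__(self, x_val, y_val):
        super().__init__()
        self.x_val = x_val
        self.y_val = y_val

    def on_train_begin(self, logs=None):
        self.confusion, self.precision, self.recall = [], [], []
        self.f1s, self.kappa, self.auc = [], [], []

    def on_epoch_end(self, epoch, logs=None):
        # Score the held-out set once per epoch (binary labels assumed).
        score = np.asarray(self.model.predict(self.x_val)).ravel()
        predict = (score > 0.5).astype(int)
        target = np.asarray(self.y_val).ravel()

        self.auc.append(skm.roc_auc_score(target, score))
        self.confusion.append(skm.confusion_matrix(target, predict))
        self.precision.append(skm.precision_score(target, predict))
        self.recall.append(skm.recall_score(target, predict))
        self.f1s.append(skm.f1_score(target, predict))
        self.kappa.append(skm.cohen_kappa_score(target, predict))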
import os
import re
import requests
import json
import time
import datetime
import genanki

# Define the note model (the first argument is the model's unique id).
my_model = genanki.Model(
    201901021920,
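A hedged sketch of how the model definition might be completed. The model name, fields, card template, deck id, and sample note are invented for illustration; only the constructor shape (model_id, name, fields, templates) follows genanki's documented API.

import genanki

my_model = genanki.Model(
    201901021920,
    'Vocabulary Card',                      # hypothetical model name
    fields=[
        {'name': 'Word'},
        {'name': 'Definition'},
    ],
    templates=[
        {
            'name': 'Card 1',
            'qfmt': '{{Word}}',
            'afmt': '{{FrontSide}}<hr id="answer">{{Definition}}',
        },
    ],
)

# Hypothetical usage: one note packed into a deck and written to disk.
my_deck = genanki.Deck(201901021921, 'Example Deck')
my_deck.add_note(genanki.Note(model=my_model, fields=['ubiquitous', 'present everywhere']))
genanki.Package(my_deck).write_to_file('example.apkg')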
-- Date formatting examples; in the Java-style patterns below, MM is the month and mm is minutes.
select to_char('2018-04-26 22:23:40', 'yyyyMMdd');
select date_format('2018-04-26 22:23:40', 'yyyyMMdd');
select date_format('2018-04-26 22:23:40', 'yyyy-MM-dd HH:mm:ss');
select to_char('2018-04-26 22:23:40', 'yyyy-MM-dd hh24:mi:ss');
select date_format(to_unix_timestamp(nvl('2018-04-26 22:23:40', '')), 'yyyyMMdd');
select from_unixtime(unix_timestamp('20171205 22:23:40', 'yyyyMMdd HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss');
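If these queries are meant for Spark SQL, one quick way to sanity-check the patterns is from PySpark; this harness is an assumption, not part of the original snippet, and only the date_format/from_unixtime forms are exercised.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('date-format-check').getOrCreate()

# String literals are implicitly cast to timestamps by these functions.
spark.sql("select date_format('2018-04-26 22:23:40', 'yyyyMMdd')").show()
spark.sql("select date_format('2018-04-26 22:23:40', 'yyyy-MM-dd HH:mm:ss')").show()
spark.sql("""
    select from_unixtime(unix_timestamp('20171205 22:23:40', 'yyyyMMdd HH:mm:ss'),
                         'yyyy-MM-dd HH:mm:ss')
""").show()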
package topic

import spark.broadcast._
import spark.SparkContext
import spark.SparkContext._
import spark.RDD
import spark.storage.StorageLevel
import scala.util.Random
import scala.math.{ sqrt, log, pow, abs, exp, min, max }
import scala.collection.mutable.HashMap