def __getitem__(self, ind):
    key = self.keys[ind]
    text_file = self.text_files[key]
    image_file = self.image_files[key]
    image = Image.open(image_file)

    # an image may come with several captions; keep the non-empty ones and pick one at random
    descriptions = text_file.read_text().split('\n')
    descriptions = list(filter(lambda t: len(t) > 0, descriptions))
    description = choice(descriptions)

    tokenized_text = tokenize(description, self.text_len,
                              truncate_text=args.truncate_captions).squeeze(0)
    image_tensor = self.image_tranform(image)
    return tokenized_text, image_tensor
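Here self.image_tranform is expected to be a PIL-to-tensor preprocessing pipeline. A minimal sketch of one such pipeline, assuming torchvision transforms and a 256-pixel target resolution (both the size and the RGB conversion step are assumptions, not read from the class):

from torchvision import transforms as T

# hypothetical pipeline; the target size and the RGB conversion are assumptions
image_transform = T.Compose([
    T.Lambda(lambda img: img.convert('RGB')),  # guard against grayscale / RGBA files
    T.Resize(256),
    T.CenterCrop(256),
    T.ToTensor(),  # PIL image -> float tensor in [0, 1], shape (3, 256, 256)
])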
def __getitem__(self, ind):
    key = self.keys[ind]
    text_file = self.text_files[key]
    image_file = self.image_files[key]
    image = Image.open(image_file)

    descriptions = text_file.read_text().split('\n')
    descriptions = list(filter(lambda t: len(t) > 0, descriptions))
    description = choice(descriptions)

    tokenized_text = tokenize(description, self.text_len,
                              truncate_text=args.truncate_captions).squeeze(0)
    image_tensor = self.image_tranform(image)

    # additionally write the preprocessed image back to disk under ../dataset/COCO_256,
    # keeping the original file name
    save_image(image_tensor,
               os.path.join('../dataset/COCO_256', str(image_file).split('/')[-1]),
               normalize=True)
    return tokenized_text, image_tensor
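Because __getitem__ returns two fixed-shape tensors, the dataset can be batched with PyTorch's default collate. A minimal usage sketch; the dataset object and BATCH_SIZE are assumptions for illustration:

from torch.utils.data import DataLoader

# hypothetical construction; the dataset object and BATCH_SIZE are assumptions
dl = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True)

for tokens, images in dl:
    # tokens: (BATCH_SIZE, text_len) int64, images: (BATCH_SIZE, 3, H, W) float
    break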
if vae_params is not None:
    vae = DiscreteVAE(**vae_params)
elif not args.taming:
    vae = OpenAIDiscreteVAE()
else:
    vae = VQGanVAE1024()

dalle = DALLE(vae=vae, **dalle_params).cuda()
dalle.load_state_dict(weights)

# generate images
image_size = vae.image_size

text = tokenize([args.text], dalle.text_seq_len).cuda()
text = repeat(text, '() n -> b n', b=args.num_images)

# create masks
mask = text != 0

outputs = []
for text_chunk, mask_chunk in tqdm(zip(text.split(args.batch_size), mask.split(args.batch_size)),
                                   desc='generating images'):
    output = dalle.generate_images(text_chunk, mask=mask_chunk, filter_thres=args.top_k)
    outputs.append(output)
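After the loop, the per-chunk outputs are typically concatenated and written to disk one file per image. A minimal sketch of that step; the output directory and file naming are assumptions, not taken from the snippet above:

from pathlib import Path
import torch
from torchvision.utils import save_image

images = torch.cat(outputs)  # (num_images, 3, image_size, image_size)

# hypothetical output location and naming
outputs_dir = Path('./outputs') / args.text.replace(' ', '_')
outputs_dir.mkdir(parents=True, exist_ok=True)

for i, image in enumerate(images):
    save_image(image, str(outputs_dir / f'{i}.png'), normalize=True)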
wandb.config.depth = DEPTH
wandb.config.heads = HEADS
wandb.config.dim_head = DIM_HEAD

wandb.init(project='dalle_train_transformer_coco', resume=RESUME)

# training
for epoch in range(epoch_start, EPOCHS):
    for i, (images, text) in enumerate(dl):
        images = torch.stack(images)

        # each sample carries several captions; pick one non-empty caption per image
        text_list = []
        for descriptions in text:
            descriptions = list(filter(lambda t: len(t) > 0, descriptions))
            description = choice(descriptions)
            text_list.append(description)

        text = tokenize(text_list).squeeze(0)
        mask = text != 0

        text, images, mask = map(lambda t: t.cuda(), (text, images, mask))

        loss = dalle(text, images, mask=mask, return_loss=True)
        loss = torch.sum(loss)

        loss.backward()
        # clip_grad_norm_(dalle.parameters(), GRAD_CLIP_NORM)
        opt.step()
        opt.zero_grad()

        log = {}

        if i % 10 == 0:
            print(epoch, i, f'loss - {loss.item()}')
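The log dict above is presumably filled and sent to wandb, and the model checkpointed at some interval; the snippet ends before that happens. A hedged sketch of how the tail of the inner loop might continue, where the logged keys, checkpoint path, and save interval are all assumptions:

# (continuation of the inner batch loop, after the loss print)
if i % 10 == 0:
    log = {'epoch': epoch, 'iter': i, 'loss': loss.item()}
wandb.log(log)

# hypothetical periodic checkpoint; path and interval are assumptions
if i % 1000 == 0:
    torch.save(dalle.state_dict(), './dalle_coco.pt')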